SARS VIRUS NUCLEOTIDE AND AMINO ACID
SEQUENCES AND USES THEREOF

Field of the Invention 
The invention is in the field of virology. More specifically, the invention is in 
the field of coronaviruses. 

Background of the Invention 
Severe acute respiratory syndrome (SARS), a worldwide outbreak of atypical
pneumonia with an overall mortality rate of about 3 to 6%, has been attributed to a
coronavirus following tests of causation according to Koch's postulates, including
monkey inoculation (R. Munch, Microbes Infect. 5, 69-74, Jan. 2003). The
coronaviruses are members of a family of enveloped viruses that replicate in the
cytoplasm of animal host cells (B. N. Fields et al., Fields Virology, Lippincott Williams
& Wilkins, Philadelphia, 4th ed., 2001). They are distinguished by the presence of a
single-stranded plus sense RNA genome, approximately 30 kb in length, that has a 5'
cap structure and 3' polyA tract. Hence the genome is essentially a very large mRNA.
Upon infection of an appropriate host cell, the 5'-most open reading frame (ORF) of
the viral genome is translated into a large polyprotein that is cleaved by viral-encoded
proteases to release several nonstructural proteins including an RNA-dependent RNA
polymerase (Pol) and an ATPase helicase (Hel). These proteins in turn are responsible
for replicating the viral genome as well as generating nested transcripts that are used in
the synthesis of the viral proteins. The mechanism by which these subgenomic mRNAs
are made is not fully understood; however, transcription regulating sequences (TRSs) at
the 5' end of each gene may represent signals that regulate the discontinuous
transcription of subgenomic mRNAs (sgmRNAs). The TRSs include a partially
conserved core sequence (CS) that in some coronaviruses is 5'-CUAAAC-3'. Two
major models have been proposed to explain the discontinuous transcription in
coronaviruses and arterioviruses (M. M. C. Lai, D. Cavanagh, Adv. Virus Res.
48, 1 (1997); S. G. Sawicki, D. L. Sawicki, Adv. Exp. Med. Biol. 440, 215 (1998)). The
discovery of transcriptionally active, subgenomic-size minus strands containing the
antileader sequence and transcription intermediates active in the synthesis of mRNAs
(D. L. Sawicki et al., J. Gen. Virol. 82, 386 (2001); S. G. Sawicki, D. L. Sawicki, J. Virol.
64, 1050 (1990); M. Schaad, R. S. Baric, J. Virol. 68, 8169 (1994); P. B. Sethna et al.,
Proc. Natl. Acad. Sci. U.S.A. 86, 5626 (1989)) favors the model of discontinuous
transcription during the minus strand synthesis (S. G. Sawicki, D. L. Sawicki, Adv. Exp.
Med. Biol. 440, 215 (1998)).

The coronaviral membrane proteins, including the major proteins S (Spike) and
M (Membrane), are inserted into the endoplasmic reticulum Golgi intermediate
compartment (ERGIC) while full length replicated RNA (+ strands) assemble with the
N (nucleocapsid) protein. This RNA-protein complex then associates with the M
protein embedded in the membranes of the ER and virus particles form as the
nucleocapsid complex buds into the ER. The virus then migrates through the Golgi
complex and eventually exits the cell, likely by exocytosis (B. N. Fields et al., Fields
Virology, Lippincott Williams & Wilkins, Philadelphia, 4th ed., 2001). The site of viral
attachment to the host cell resides within the S protein.

The coronaviruses include a large number of viruses that infect different animal
species. The predominant diseases associated with these viruses are respiratory and
enteric infections, although hepatic and neurological diseases also occur with some
viruses. Coronaviruses are divided into three serotypes, Types I, II and III.
Phylogenetic analysis of coronavirus sequences also identifies three main classes of
these viruses, corresponding to each of the three serotypes. Type II coronaviruses
contain a hemagglutinin esterase (HE) gene homologous to that of Influenza C virus. It
is presumed that the precursor of the Type II coronaviruses acquired HE as a result of a
recombination event within a doubly infected host cell.

In view of the rapid worldwide dissemination of SARS, which has the potential
of creating a pandemic, along with its alarming morbidity and mortality rates, it would
be useful to have a better understanding of this coronavirus agent at the molecular level
to provide diagnostics, vaccines, and therapeutics, and to support public health control
measures.

Summary of the Invention
In general, the invention provides the genomic sequence of a novel coronavirus,
the SARS virus, and provides novel nucleic acid molecules encoding novel proteins
that may be used, for example, for the diagnosis or therapy of a variety of SARS
virus-related disorders.

In one aspect, the invention provides a substantially pure SARS virus nucleic
acid molecule or fragment thereof, for example, a genomic RNA or DNA, cDNA,
synthetic DNA, or mRNA molecule. In some embodiments, the nucleic acid molecule
includes a sequence substantially identical to any of the sequences of SEQ ID NOs:
1-13, 15-18, 20-30, 90-159, 208, and 209. In some embodiments, the nucleic acid
molecule includes a sequence from SEQ ID NO: 1, SEQ ID NO: 2, or SEQ ID NO: 15 or a
fragment of these sequences. In alternative embodiments, the nucleic acid molecule
may include a sequence substantially identical to SEQ ID NO: 1, SEQ ID NO: 2, or
SEQ ID NO: 15, or a fragment thereof. In alternative embodiments, the nucleic acid
molecule may include a s2m motif (for example, a s2m sequence substantially identical
to any of the sequences of SEQ ID NOs: 16, 17, and 18), a leader sequence (for
example, a sequence substantially identical to the sequence of SEQ ID NO: 3), or a
transcriptional regulatory sequence (for example, a sequence substantially identical to
any of the sequences of SEQ ID NOs: 4-13 and 20-30). In alternative embodiments, the
nucleic acid molecule includes a sequence substantially identical to any of the
sequences of nucleotides 265-13,398; 13,398-21,485; 21,492-25,259;
25,268-26,092; 25,689-26,153; 26,117-26,347; 26,398-27,063; 27,074-27,265;
27,273-27,641; 27,638-27,772; 27,779-27,898; 27,864-28,118; 28,120-29,388;
28,130-28,426; 28,583-28,795; and 29,590-29,621 of SEQ ID NO: 15. In alternative
embodiments, the nucleic acid molecule may encode a polyprotein or a polypeptide. In
alternative embodiments, the invention provides a nucleic acid molecule including a
sequence complementary to a SARS virus nucleotide sequence.
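
By way of illustration only, the nucleotide ranges recited above could be pulled out of a genome string as sketched below. The sketch assumes the ranges are 1-based and inclusive, which the text does not state explicitly; the names subsequence and range_length are hypothetical helpers, and the short string in the usage note merely stands in for SEQ ID NO: 15.

```python
def subsequence(genome: str, start: int, end: int) -> str:
    """Return nucleotides start..end of genome.

    Assumption: coordinates are 1-based and inclusive, as is common
    for sequence listings; the specification does not say so itself.
    """
    if not (1 <= start <= end <= len(genome)):
        raise ValueError(f"range {start}-{end} lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return genome[start - 1:end]


def range_length(start: int, end: int) -> int:
    """Number of nucleotides in an inclusive start..end range."""
    return end - start + 1
```

For example, subsequence("ACGTACGT", 2, 4) returns "CGT", and range_length(21492, 25259) gives 3768 nucleotides for the range 21,492-25,259 recited above.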

In an alternative aspect, the invention provides a substantially pure SARS virus
polypeptide or fragment thereof, for example, a polyprotein, glycoprotein (for example,
a matrix glycoprotein that may include a sequence substantially identical to the
sequence of SEQ ID NO: 34), a transmembrane protein (for example, a
multitransmembrane protein, a type I transmembrane protein, or a type II


wo 2004/096842 PCT/CA2004/000626 

transmembrane protein), an RNA binding protein, or a viral envelope protein. In
alternative embodiments, the invention provides a replicase 1a protein, replicase 1b
protein, a spike glycoprotein, a small envelope protein, a matrix glycoprotein, or a
nucleocapsid protein. In alternative embodiments, the invention provides a nucleic acid
molecule encoding a SARS virus polypeptide. In alternative embodiments, the SARS
virus polypeptide includes an identifiable signal sequence (for example, a signal
sequence substantially identical to the sequence of SEQ ID NO: 76 or 85), a
transmembrane domain (for example, a transmembrane domain substantially identical
to any of the sequences of SEQ ID NOs: 77-86), a transmembrane anchor, a
transmembrane helix, an ATP-binding domain, a nuclear localization signal, a
hydrophilic domain (for example, a hydrophilic domain substantially identical to the
sequence of SEQ ID NO: 87), or a lysine-rich sequence (for example, a sequence
substantially identical to the sequence of SEQ ID NO: 14). In alternative embodiments,
the SARS virus polypeptide may include a sequence substantially identical to any of
the sequences of SEQ ID NOs: 14, 33-36, 64-74, and 76-87.

In alternative embodiments, the invention provides a vector (for example, a
gene therapy vector or a cloning vector) including a SARS virus nucleic acid molecule
(for example, a molecule including a sequence substantially identical to any of the
sequences of SEQ ID NOs: 1-13, 15-18, 20-30, 90-159, 208, and 209), or a host cell (for
example, a mammalian cell, a yeast, a bacterium, or a nematode cell) including the
vector.

In alternative embodiments, the invention provides a nucleic acid molecule
having substantial nucleotide sequence identity (for example, 30%, 40%, 50%, 60%,
70%, 80%, 90% or 100% complementarity) to a sequence encoding a SARS virus
polypeptide or fragment thereof, for example where the fragment includes at least six
amino acids, and where the nucleic acid molecule hybridizes under high stringency
conditions to at least a portion of a SARS virus nucleic acid molecule.

In alternative embodiments, the invention provides a nucleic acid molecule
having substantial nucleotide sequence identity (for example, 30%, 40%, 50%, 60%,
70%, 80%, 90% or 100% complementarity) to a SARS virus nucleotide sequence, for
example where the nucleic acid molecule includes at least ten nucleotides, and where
the nucleic acid molecule hybridizes under high stringency conditions to at least a
portion of a SARS virus nucleic acid molecule.

In alternative embodiments, the invention provides a nucleic acid molecule
comprising a sequence that is antisense to a SARS virus nucleic acid molecule, or an
antibody (for example, a neutralizing antibody) that specifically binds to a SARS virus
polypeptide.

In alternative embodiments, the invention provides a method for detecting a
SARS epitope, such as a virion or polypeptide in a sample, by contacting the sample
with an antibody that specifically binds a SARS epitope, such as a virus polypeptide,
and determining whether the antibody specifically binds to the polypeptide. In
alternative embodiments, the invention provides a method for detecting a SARS virus
genome, gene, or homolog or fragment thereof in a sample by contacting a SARS virus
nucleic acid molecule, for example where the nucleic acid molecule includes at least
ten nucleotides, with a preparation of genomic DNA from the sample, under
hybridization conditions providing detection of DNA sequences having nucleotide
sequence identity to a SARS virus nucleic acid molecule. In alternative embodiments,
the invention provides a method of targeting a protein for secretion from a cell, by
attaching a signal sequence from a SARS virus polypeptide to the protein, such that the
protein is secreted from the cell.

In alternative aspects, the invention provides a method for eliciting an immune
response in an animal, by identifying an animal infected with or at risk for infection
with a SARS virus and administering a SARS virus polypeptide or fragment thereof,
or administering a SARS virus nucleic acid molecule encoding a
SARS virus polypeptide or fragment thereof, to the animal. In alternative embodiments,
the administering results in the production of an antibody in the mammal, or results in
the generation of cytotoxic or helper T-lymphocytes in the mammal.

In alternative embodiments, the invention provides a kit for detecting the
presence of a SARS virus nucleic acid molecule or polypeptide in a sample, where the
kit includes a SARS virus nucleic acid molecule, or an antibody that specifically binds
a SARS virus polypeptide.

In alternative aspects, the invention provides a method for treating or preventing
a SARS virus infection by identifying an animal (e.g., a human) infected with or at risk
for infection with a SARS virus, and administering a SARS virus nucleic acid molecule
or polypeptide, or administering a compound that inhibits pathogenicity or replication
of a SARS virus, to the animal. In alternative embodiments, the invention provides the
use of a SARS virus nucleic acid molecule or polypeptide for treating or preventing a
SARS virus infection.

In alternative aspects, the invention provides a method of identifying a
compound for treating or preventing a SARS virus infection, by contacting a sample
including a SARS virus nucleic acid molecule or contacting a SARS virus polypeptide
with the compound, where an increase or decrease in the expression or activity of the
nucleic acid molecule or the polypeptide identifies a compound for treating or
preventing a SARS virus infection.

In alternative aspects, the invention provides a vaccine (e.g., a DNA vaccine)
including a SARS virus nucleic acid molecule or polypeptide.

In alternative aspects, the invention provides a microarray including a plurality
of elements, wherein each element includes one or more distinct nucleic acid or amino
acid sequences, and where the sequences are selected from a SARS virus nucleic acid
molecule or polypeptide, or an antibody that specifically binds a SARS virus nucleic
acid molecule or polypeptide.

In alternative aspects, the invention provides a computer readable record (e.g., a
database) including distinct SARS virus nucleic acid or amino acid sequences.

A "SARS virus" is a virus putatively belonging to the coronavirus family and
identified as the causative agent for severe acute respiratory syndrome (SARS). A
SARS virus nucleic acid molecule may include a sequence substantially identical to the
nucleotide sequences described herein or fragments thereof. A SARS virus polypeptide
may include a sequence substantially identical to a sequence encoded by a SARS virus
nucleic acid molecule, or may include a sequence substantially identical to the
polypeptide sequences described herein, or fragments thereof.

A compound is "substantially pure" when it is separated from the components
that naturally accompany it. Typically, a compound is substantially pure when it is at
least 60%, more generally 75% or over 90%, by weight, of the total material in a
sample. Thus, for example, a polypeptide that is chemically synthesized or produced
by recombinant technology will generally be substantially free from its naturally
associated components. A nucleic acid molecule may be substantially pure when it is
not immediately contiguous with (i.e., covalently linked to) the coding sequences with
which it is normally contiguous in the naturally occurring genome of the organism from
which the DNA of the invention is derived. A nucleic acid molecule may also be
substantially pure when it is isolated from the organism in which it is normally found.
A substantially pure compound can be obtained, for example, by extraction from a
natural source; by expression of a recombinant nucleic acid molecule encoding a
polypeptide compound; or by chemical synthesis. Purity can be measured using any
appropriate method such as column chromatography, gel electrophoresis, HPLC, etc.

A "substantially identical" sequence is an amino acid or nucleotide sequence
that differs from a reference sequence only by one or more conservative substitutions,
as discussed herein, or by one or more non-conservative substitutions, deletions, or
insertions located at positions of the sequence that do not destroy the biological
function of the amino acid or nucleic acid molecule. Such a sequence can be at least
10%, 20%, 30%, 40%, 50%, 52.5%, 55% or 60% or 75%, or more generally at least
80%, 85%, 90%, or 95%, or as much as 99% or 100% identical at the amino acid or
nucleotide level to the sequence used for comparison using, for example, the Align
Program (Myers and Miller, CABIOS, 1989, 4:11-17) or FASTA. For polypeptides,
the length of comparison sequences may be at least 4, 5, 10, or 15 amino acids, or at
least 20, 25, or 30 amino acids. In alternate embodiments, the length of comparison
sequences may be at least 35, 40, or 50 amino acids, or over 60, 80, or 100 amino acids.
For nucleic acid molecules, the length of comparison sequences may be at least 15, 20,
or 25 nucleotides, or at least 30, 40, or 50 nucleotides. In alternate embodiments, the
length of comparison sequences may be at least 60, 70, 80, or 90 nucleotides, or over
100, 200, or 500 nucleotides. Sequence identity can be readily measured using publicly
available sequence analysis software (e.g., Sequence Analysis Software Package of the
Genetics Computer Group, University of Wisconsin Biotechnology Center, 1710
University Avenue, Madison, Wis. 53705, or BLAST software available from the
National Library of Medicine, or as described herein). Examples of useful software
include the programs Pile-up and PrettyBox. Such software matches similar sequences
by assigning degrees of homology to various substitutions, deletions, insertions, and
other modifications.
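
As a purely illustrative sketch of how a percent identity figure of the kind recited above might be computed, the following assumes that the two sequences have already been aligned to equal length by some external program, with "-" marking gaps. It is not the Align or FASTA algorithm; the function name percent_identity and the choice to count gapped columns as mismatches are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumptions: inputs are pre-aligned and of equal length, "-" marks a gap,
    columns where both sequences have a gap are ignored, and a column with a
    gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared
```

For example, percent_identity("ACGTACGTAC", "ACGTTCGTAC") is 90.0, since nine of the ten aligned positions agree.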
Alternatively, or additionally, two nucleic acid sequences may be "substantially
identical" if they hybridize under high stringency conditions. In some embodiments,
high stringency conditions are, for example, conditions that allow hybridization
comparable with the hybridization that occurs using a DNA probe of at least 500
nucleotides in length, in a buffer containing 0.5 M NaHPO4, pH 7.2, 7% SDS, 1 mM
EDTA, and 1% BSA (fraction V), at a temperature of 65°C, or a buffer containing 48%
formamide, 4.8x SSC, 0.2 M Tris-Cl, pH 7.6, 1x Denhardt's solution, 10% dextran
sulfate, and 0.1% SDS, at a temperature of 42°C. (These are typical conditions for high
stringency northern or Southern hybridizations.) Hybridizations may be carried out
over a period of about 20 to 30 minutes, or about 2 to 6 hours, or about 10 to 15 hours,
or over 24 hours or more. High stringency hybridization is also relied upon for the
success of numerous techniques routinely performed by molecular biologists, such as
high stringency PCR, DNA sequencing, single strand conformational polymorphism
analysis, and in situ hybridization. In contrast to northern and Southern hybridizations,
these techniques are usually performed with relatively short probes (e.g., usually about
16 nucleotides or longer for PCR or sequencing and about 40 nucleotides or longer for
in situ hybridization). The high stringency conditions used in these techniques are well
known to those skilled in the art of molecular biology, and examples of them can be
found, for example, in Ausubel et al., Current Protocols in Molecular Biology, John
Wiley & Sons, New York, N.Y., 1998, which is hereby incorporated by reference.

The terms "nucleic acid" or "nucleic acid molecule" encompass both RNA (plus
and minus strands) and DNA, including cDNA, genomic DNA, and synthetic (e.g.,
chemically synthesized) DNA. The nucleic acid may be double-stranded or
single-stranded. Where single-stranded, the nucleic acid may be the sense strand or the
antisense strand. A nucleic acid molecule may be any chain of two or more covalently
bonded nucleotides, including naturally occurring or non-naturally occurring
nucleotides, or nucleotide analogs or derivatives. By "RNA" is meant a sequence of
two or more covalently bonded, naturally occurring or modified ribonucleotides. One
example of a modified RNA included within this term is phosphorothioate RNA. By
"DNA" is meant a sequence of two or more covalently bonded, naturally occurring or
modified deoxyribonucleotides. By "cDNA" is meant complementary or copy DNA
produced from an RNA template by the action of RNA-dependent DNA polymerase
(reverse transcriptase). Thus a "cDNA clone" means a duplex DNA sequence
complementary to an RNA molecule of interest, carried in a cloning vector.

An "isolated nucleic acid" is a nucleic acid molecule that is free of the nucleic
acid molecules that normally flank it in the genome or that is free of the organism in
which it is normally found. Therefore, an "isolated" gene or nucleic acid molecule is in
some cases intended to mean a gene or nucleic acid molecule which is not flanked by
nucleic acid molecules which normally (in nature) flank the gene or nucleic acid
molecule (such as in genomic sequences) and/or has been completely or partially
purified from other transcribed sequences (as in a cDNA or RNA library). In some
cases, an isolated nucleic acid molecule is intended to mean the genome of an organism
such as a virus. An isolated nucleic acid of the invention may be substantially isolated
with respect to the complex cellular milieu in which it naturally occurs. In some
instances, the isolated material will form part of a composition (for example, a crude
extract containing other substances), buffer system or reagent mix. In other
circumstances, the material may be purified to essential homogeneity, for example as
determined by PAGE or column chromatography such as HPLC. The term therefore
includes, e.g., a genome; a recombinant nucleic acid incorporated into a vector, such as
an autonomously replicating plasmid or virus; or into the genomic DNA of a
prokaryote or eukaryote, or which exists as a separate molecule (e.g., a cDNA or a
genomic DNA fragment produced by PCR or restriction endonuclease treatment)
independent of other sequences. It also includes a recombinant nucleic acid which is
part of a hybrid gene encoding additional polypeptide sequences. Preferably, an
isolated nucleic acid comprises at least about 50, 80 or 90 percent (on a molar basis) of
all macromolecular species present. Thus, an isolated gene or nucleic acid molecule can
include a gene or nucleic acid molecule which is synthesized chemically or by
recombinant means. Recombinant DNA molecules contained in a vector are included in
the definition of "isolated" as used herein. Also, isolated nucleic acid molecules include
recombinant DNA molecules in heterologous host cells, as well as partially or
substantially purified DNA molecules in solution. In vivo and in vitro RNA transcripts
of the DNA molecules of the present invention are also encompassed by "isolated"
nucleic acid molecules. Such isolated nucleic acid molecules are useful in the
manufacture of the encoded polypeptide, as probes for isolating homologous sequences
(e.g., from other species), for gene mapping (e.g., by in situ hybridization with
chromosomes), or for detecting expression of the nucleic acid molecule in tissue (e.g.,
human tissue, such as peripheral blood), such as by Northern blot analysis.

Various genes and nucleic acid sequences of the invention may be recombinant
sequences. The term "recombinant" means that something has been recombined, so that
when made in reference to a nucleic acid construct the term refers to a molecule that is
comprised of nucleic acid sequences that are joined together or produced by means of
molecular biological techniques. The term "recombinant" when made in reference to a
protein or a polypeptide refers to a protein or polypeptide molecule which is expressed
using a recombinant nucleic acid construct created by means of molecular biological
techniques. The term "recombinant" when made in reference to genetic composition
refers to a gamete or progeny with new combinations of alleles that did not occur in the
parental genomes. Recombinant nucleic acid constructs may include a nucleotide
sequence which is ligated to, or is manipulated to become ligated to, a nucleic acid
sequence to which it is not ligated in nature, or to which it is ligated at a different
location in nature. Referring to a nucleic acid construct as "recombinant" therefore
indicates that the nucleic acid molecule has been manipulated using genetic
engineering, i.e., by human intervention. Recombinant nucleic acid constructs may, for
example, be introduced into a host cell by transformation. Such recombinant nucleic
acid constructs may include sequences derived from the same host cell species or from
different host cell species, which have been isolated and reintroduced into cells of the
host species. Recombinant nucleic acid construct sequences may become integrated
into a host cell genome, either as a result of the original transformation of the host cells,
or as the result of subsequent recombination and/or repair events.

As used herein, "heterologous" in reference to a nucleic acid or protein is a
molecule that has been manipulated by human intervention so that it is located in a
place other than the place in which it is naturally found. For example, a nucleic acid
sequence from one species may be introduced into the genome of another species, or a
nucleic acid sequence from one genomic locus may be moved to another genomic or
extrachromosomal locus in the same species. A heterologous protein includes, for
example, a protein expressed from a heterologous coding sequence or a protein
expressed from a recombinant gene in a cell that would not naturally express the
protein.

By "antisense," as used herein in reference to nucleic acids, is meant a nucleic
acid sequence that is complementary to one strand of a nucleic acid molecule. In some
embodiments, an antisense sequence is complementary to the coding strand of a gene,
preferably, a SARS virus gene. The preferred antisense nucleic acid molecule is one
which is capable of lowering the level of polypeptide encoded by the complementary
gene when both are expressed in a cell. In some embodiments, the polypeptide level is
lowered by at least 10%, or at least 25%, or at least 50%, as compared to the
polypeptide level in a cell expressing only the gene, and not the complementary
antisense nucleic acid molecule.

A "probe" or "primer" is a single-stranded DNA or RNA molecule of defined
sequence that can base pair to a second DNA or RNA molecule that contains a
complementary sequence (the target). The stability of the resulting hybrid molecule
depends upon the extent of the base pairing that occurs, and is affected by parameters
such as the degree of complementarity between the probe and target molecule, and the
degree of stringency of the hybridization conditions. The degree of hybridization
stringency is affected by parameters such as the temperature, salt concentration, and
concentration of organic molecules, such as formamide, and is determined by methods
that are known to those skilled in the art. Probes or primers specific for SARS virus
nucleic acid sequences or molecules may vary in length from at least 8 nucleotides to
over 500 nucleotides, including any value in between, depending on the purpose for
which, and conditions under which, the probe or primer is used. For example, a probe
or primer may be 8, 10, 15, 20, or 25 nucleotides in length, or may be at least 30, 40,
50, or 60 nucleotides in length, or may be over 100, 200, 500, or 1000 nucleotides in
length. Probes or primers specific for SARS virus nucleic acid molecules may have
greater than 20-30% sequence identity, or at least 55-75% sequence identity, or at least
75-85% sequence identity, or at least 85-99% sequence identity, or 100% sequence
identity to the nucleic acid sequences described herein. In various embodiments of the
invention, probes having the sequences:
5'-ATg AAT TAC CAA gTC AAT ggT TAG-3', SEQ ID NO: 160;
5'-gAA gCT ATT CgT CAC gTT Cg-3', SEQ ID NO: 161;
5'-CTg TAg AAA ATC CTA gCT ggA g-3', SEQ ID NO: 162;
5'-CAT AAC CAg TCg gTA CAg CTA-3', SEQ ID NO: 163;
5'-TTA TCA CCC gCgAAg AAg CT-3', SEQ ID NO: 164;
5'-CTC TAg TTg CATgAC AgC CCT C-3', SEQ ID NO: 165;
5'-TCg TgC gTg gAT TggCTT TgA TgT-3', SEQ ID NO: 166;
5'-ggg TTg ggA CTA TCC TAA gTg TgA-3', SEQ ID NO: 167;
5'-TAA CAC ACA AAC ACC ATC ATC A-3', SEQ ID NO: 168;
5'-ggT Tgg gAC TAT CCT AAg TgT gA-3', SEQ ID NO: 169;
5'-CCA TCA TCA gAT AgA ATC ATC ATA-3', SEQ ID NO: 170;
5'-CCT CTC TTg TTC TTg CTC gCA-3', SEQ ID NO: 171;
5'-TAT AgT gAg CCg CCA CAC Atg-3', SEQ ID NO: 172;
5'-TAACACACAACICCATCATCA-3', SEQ ID NO: 173;
5'-CTAACATGCTTAGGATAATGG-3', SEQ ID NO: 174;
5'-GCCTCTCTTGTTCTTGCTCGC-3', SEQ ID NO: 175;
5'-CAGGTAAGCGTAAAACTCATC-3', SEQ ID NO: 176;
5'-TACACACCTCAGCGTTG-3', SEQ ID NO: 177;
5'-CACGAACGTGACGAAT-3', SEQ ID NO: 178;
5'-GCCGGAGCTCTGCAGAATTC-3', SEQ ID NO: 179;
5'-CAGGAAACAGCTATGAC TTGCATCACCACTAGTTGTGCCACCAGGTT-3', SEQ ID NO: 180;
5'-TGTAAAACGACGGCCAGTTGATGGGATGGGACTATCCTAAGTGTGA-3', SEQ ID NO: 181;
5'-GCATAGGCAGTAGTTGCATC-3', SEQ ID NO: 182,
as well as sequences amplified by specific combinations of these probes, may be
excluded from specific uses according to the invention. Probes can be detectably-labeled,
either radioactively or non-radioactively, by methods that are known to those skilled in
the art. Probes can be used for methods involving nucleic acid hybridization, such as
nucleic acid sequencing, nucleic acid amplification by the polymerase chain reaction,
single stranded conformational polymorphism (SSCP) analysis, restriction fragment
polymorphism (RFLP) analysis, Southern hybridization, northern hybridization, in situ
hybridization, electrophoretic mobility shift assay (EMSA), and other methods that are
known to those skilled in the art.

By "complementary" is meant that two nucleic acid molecules, e.g., DNA or
RNA, contain a sufficient number of nucleotides that are capable of forming
Watson-Crick base pairs to produce a region of double-strandedness between the two
nucleic acids. Thus, adenine in one strand of DNA or RNA pairs with thymine in an opposing
complementary DNA strand or with uracil in an opposing complementary RNA strand.
It will be understood that each nucleotide in a nucleic acid molecule need not form a
matched Watson-Crick base pair with a nucleotide in an opposing complementary
strand to form a duplex.
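
The pairing rules just described can be sketched in a few lines of code. The example below is illustrative only: the names reverse_complement and paired_fraction are hypothetical, only the unambiguous bases A, C, G, T and U are handled, and the two strands are assumed to be of equal length and aligned antiparallel without gaps.

```python
_DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
_RNA_COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Watson-Crick pairs, allowing DNA-DNA, RNA-RNA and DNA-RNA duplexes.
_PAIRS = {frozenset("AT"), frozenset("AU"), frozenset("GC")}


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the complementary strand, written 5' to 3'."""
    table = _RNA_COMPLEMENT if rna else _DNA_COMPLEMENT
    return "".join(table[base] for base in reversed(seq.upper()))


def paired_fraction(strand: str, opposing: str) -> float:
    """Fraction of positions that form Watson-Crick pairs.

    Both strands are given 5' to 3'; the opposing strand is reversed so
    that the two are compared in antiparallel orientation.
    """
    if not strand or len(strand) != len(opposing):
        raise ValueError("strands must be non-empty and of equal length")
    pairs = zip(strand.upper(), reversed(opposing.upper()))
    matched = sum(1 for x, y in pairs if frozenset((x, y)) in _PAIRS)
    return matched / len(strand)
```

For example, reverse_complement("ATGC") returns "GCAT", and paired_fraction("ATGC", "GCAA") is 0.75, reflecting a duplex in which one of the four positions is not a matched Watson-Crick pair.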

By "vector" is meant a DNA molecule derived, e.g., from a plasmid,
bacteriophage, or mammalian or insect virus, or artificial chromosome, that may be
used to introduce a polypeptide, for example a SARS virus polypeptide, into a host cell
by means of replication or expression of an operably linked heterologous nucleic acid
molecule. By "operably linked" is meant that a nucleic acid molecule such as a gene
and one or more regulatory sequences (e.g., promoters, ribosomal binding sites,
terminators in prokaryotes; promoters, terminators, enhancers in eukaryotes; leader
sequences, etc.) are connected in such a way as to permit the desired function, e.g., gene
expression, when the appropriate molecules (e.g., transcriptional activator proteins) are
bound to the regulatory sequences. A vector may contain one or more unique restriction
sites and may be capable of autonomous replication in a defined host or vehicle
organism such that the cloned sequence is reproducible. By "DNA expression vector"
is meant any autonomous element capable of directing the synthesis of a recombinant
peptide. Such DNA expression vectors include bacterial plasmids and phages and
mammalian and insect plasmids and viruses. A "shuttle vector" is understood as
meaning a vector which can be propagated in at least two different cell types, or
organisms, for example vectors which are first propagated or replicated in prokaryotes
for, for example, subsequent transfection into eukaryotic cells. A "replicon" is
a unit that is capable of autonomous replication in a cell and may include plasmids,
chromosomes (e.g., mini-chromosomes), cosmids, viruses, etc. A replicon may be a
vector.

A "host cell" is any cell, including a prokaryotic or eukaryotic cell, into which a
replicon, such as a vector, has been introduced by, for example, transformation,
transfection, or infection.

An "open reading frame" or "ORF" is a nucleic acid sequence that encodes a
polypeptide. An ORF may include a coding sequence, i.e., a sequence that is
capable of being transcribed into mRNA and/or translated into a protein when
combined with the appropriate regulatory sequences. In general, a coding sequence
includes a 5' translation start codon and a 3' translation stop codon.

A "leader sequence" is a relatively short nucleotide sequence located at the 5'
end of an RNA molecule that acts as a primer for transcription.

A "transcriptional regulatory sequence," "TRS," or "intergenic sequence" is a
nucleotide sequence that lies upstream of an open reading frame (ORF) and serves as a
template for the reassociation of a nascent RNA strand-polymerase complex.

A "frameshift mutation" is caused by a shift in an open reading frame, generally
due to a deletion or addition of at least one nucleotide, such that an alternative
polypeptide is ultimately translated.
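The effect of a single-nucleotide deletion on the reading frame can be sketched as follows. The 15-nucleotide sequence is made up purely for illustration and is not a SARS virus sequence; the helper name `codons` is likewise hypothetical.

```python
def codons(seq, frame=0):
    """Split `seq` into complete triplets, starting at offset `frame`."""
    seq = seq[frame:]
    usable = len(seq) - len(seq) % 3  # drop any trailing partial codon
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Made-up coding sequence: start codon, three sense codons, stop codon.
original = "ATGGCATTCGGATAA"

# Delete the nucleotide at index 4 (the "C" of the second codon).
shifted = original[:4] + original[5:]

original_codons = codons(original)
shifted_codons = codons(shifted)
```

After the deletion every codon downstream of the change is different, and the original stop codon is no longer read in frame, which is why an alternative polypeptide results.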

By "detectably labeled" is meant any means for marking and identifying the
presence of a molecule, e.g., an oligonucleotide probe or primer, a gene or fragment
thereof, a cDNA molecule, a polypeptide, or an antibody. Methods for detectably
labeling a molecule are well known in the art and include, without limitation,
radioactive labeling (e.g., with an isotope such as ³²P or ³⁵S) and nonradioactive
labeling such as enzymatic labeling (for example, using horseradish peroxidase or
alkaline phosphatase), chemiluminescent labeling, fluorescent labeling (for example,
using fluorescein), bioluminescent labeling, antibody detection of a ligand attached to
the probe, or detection of double-stranded nucleic acid. Also included in this definition
is a molecule that is detectably labeled by an indirect means, for example, a molecule
that is bound with a first moiety (such as biotin) that is, in turn, bound to a second
moiety that may be observed or assayed (such as fluorescein-labeled streptavidin).
Labels also include digoxigenin, luciferases, and aequorin.

A "peptide," "protein," "polyprotein" or "polypeptide" is any chain of two or
more amino acids, including naturally occurring or non-naturally occurring amino acids
or amino acid analogues, regardless of post-translational modification (e.g.,
glycosylation or phosphorylation). A "polyprotein," "polypeptide," "peptide" or
"protein" of the invention may include peptides or proteins that have abnormal
linkages, cross links and end caps, non-peptidyl bonds or alternative modifying groups.
Such modified peptides are also within the scope of the invention. The term
"modifying group" is intended to include structures that are directly attached to the
peptidic structure (e.g., by covalent coupling), as well as those that are indirectly
attached to the peptidic structure (e.g., by a stable non-covalent association or by
covalent coupling to additional amino acid residues, or mimetics, analogues or
derivatives thereof, which may flank the core peptidic structure). For example, the
modifying group can be coupled to the amino-terminus or carboxy-terminus of a
peptidic structure, or to a peptidic or peptidomimetic region flanking the core domain.
Alternatively, the modifying group can be coupled to a side chain of at least one amino
acid residue of a peptidic structure, or to a peptidic or peptidomimetic region flanking
the core domain (e.g., through the epsilon amino group of a lysyl residue(s), through
the carboxyl group of an aspartic acid residue(s) or a glutamic acid residue(s), through
a hydroxy group of a tyrosyl residue(s), a serine residue(s) or a threonine residue(s) or
other suitable reactive group on an amino acid side chain). Modifying groups
covalently coupled to the peptidic structure can be attached by means and using
methods well known in the art for linking chemical structures, including, for example,
amide, alkylamino, carbamate or urea bonds.

A "polyprotein" is the polypeptide that is initially translated from the genome of
a plus-stranded RNA virus, for example, a SARS virus. Accordingly, a polyprotein has
not been subjected to post-translational processing by proteolytic cleavage into its
processed protein products, and therefore retains its cleavage sites. In some
embodiments of the invention, the protease cleavage sites of a polyprotein may be
modified, for example, by amino acid substitution, to result in a polyprotein that is
incapable of being cleaved into its processed protein products.

An antibody "specifically binds" or "selectively binds" an antigen when it
recognizes and binds the antigen, but does not substantially recognize and bind other
molecules in a sample, having for example an affinity for the antigen which is 10, 100,
1000 or 10000 times greater than the affinity of the antibody for another reference
molecule in a sample. A "neutralizing antibody" is an antibody that selectively
interferes with any of the biological activities of a SARS virus polypeptide or
polyprotein, for example, replication of the SARS virus, or infection of host cells. A
neutralizing antibody may reduce the ability of a SARS virus polypeptide to carry out
its specific biological activity by about 50%, or by about 70%, or by about 90% or
more, or may completely abolish the ability of a SARS virus polypeptide to carry out
its specific biological activity. Any standard assay for the biological activity of any
SARS virus polypeptide, for example, assays determining expression levels, ability to
infect host cells, or ability to replicate DNA, including those assays described herein or
known to those of skill in the art, may be used to assess potentially neutralizing
antibodies that are specific for SARS virus polypeptides.

A "signal sequence" is a sequence of amino acids that may be identified, for
example by homology or biological activity, to a peptide sequence with the known
function of targeting a polypeptide to a particular region of the cell. A signal sequence
or signal peptide may be a peptide of any length that is capable of targeting a
polypeptide to a particular region of the cell. In some embodiments, the signal
sequence may direct the polypeptide to the cellular membrane so that the polypeptide
may be secreted. In alternate embodiments, the signal sequence may direct the
polypeptide to an intracellular compartment or organelle, such as the Golgi apparatus,
or to the surface of a virus, such as the SARS virus. In alternate embodiments, a signal
sequence may range from about 13 or 15 amino acids in length to about 60 amino acids
in length.

A "transmembrane protein" is an amphipathic protein having a hydrophobic
region ("transmembrane domain") that spans the lipid bilayer of the cell membrane
from the cytoplasm to the cell surface, or spans the viral envelope, interspersed
between hydrophilic regions on both sides of the membrane. The number of
hydrophobic regions in an amphipathic protein is often proportional to the number of
times that protein spans the lipid bilayer. Thus, a single transmembrane protein spans
the lipid bilayer once, and has a single transmembrane domain, while a
multi-transmembrane protein spans the lipid bilayer multiple times. Multi-transmembrane
proteins may enable virus entry into a host cell, or act to initiate transduction of a signal
from the cell surface to the interior of the cell, for example, by a conformational change
upon ligand binding. A "transmembrane anchor" is a transmembrane domain that
maintains a polypeptide in its position in the cell membrane or viral envelope and is
generally hydrophobic. A transmembrane anchor may generally be in the structure of
an alpha helix, i.e., a "transmembrane helix". Multi-transmembrane proteins may have
multiple transmembrane alpha-helices.

A "nuclear localization signal" is an amino acid sequence that permits the entry
of a polypeptide into the nucleus of a cell through nuclear pores. A nuclear localization
signal generally has a cluster of positively charged residues, for example, lysines. A
"lysine-rich sequence" is a sequence having at least two contiguous lysine residues, or
at least three contiguous lysine residues. In some embodiments, a lysine-rich sequence
may be a nuclear localization signal.
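As a rough illustration of the "lysine-rich sequence" definition, the following Python sketch reports runs of contiguous lysines (K) in a protein sequence. The function name and the test peptide are hypothetical; finding such a run does not by itself show that the run functions as a nuclear localization signal.

```python
import re


def lysine_runs(protein, min_len=2):
    """Return (1-based start position, run) for each run of >= min_len lysines."""
    pattern = re.compile("K{%d,}" % min_len)
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# Made-up peptide used only to exercise the function.
example = "MAKKRPKKKLS"
runs_of_two = lysine_runs(example, min_len=2)
runs_of_three = lysine_runs(example, min_len=3)
```

The `min_len` parameter switches between the two thresholds named in the definition (at least two, or at least three, contiguous lysine residues).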

An "ATP binding domain" is a consensus domain that is found in many ATP or
GTP-binding proteins, and that forms a flexible loop (P-loop) between alpha-helical
and beta pleated sheet domains. The general consensus for an ATP binding domain
may be (A or G)-XXXXGK-(S or T).
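The consensus (A or G)-XXXXGK-(S or T) maps directly onto a regular expression, with X read as any residue. The sketch below is one way to scan a protein sequence for it; the function name and the test peptides are made up for illustration, and a pattern match is only a candidate, not evidence of ATP binding.

```python
import re

# (A or G), any four residues, G, K, then (S or T).
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_p_loops(protein):
    """Return (1-based start position, matched segment) for each consensus match."""
    return [(m.start() + 1, m.group()) for m in P_LOOP.finditer(protein.upper())]


# Made-up peptides: the first contains the consensus, the second does not.
with_motif = find_p_loops("MAGPPGTGKSTL")
without_motif = find_p_loops("MAGPPGTGRSTL")
```

Note that `finditer` reports non-overlapping matches only, which is usually sufficient for a motif of this length.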

An "RNA binding protein" is a protein that is capable of binding to an RNA
molecule (see, for example, "RNA Binding Proteins: New Concepts in Gene
Regulation" 1st ed, eds. K. Sandberg and S.E. Mulroney, Kluwer Academic
Publishers, 2001). RNA binding proteins may contain common structural features such
as arginine-rich tracts, for example, arginines alternating with aspartates, serines, or
glycines, or zinc finger regions. RNA binding proteins may also have a common
ribonucleotide sequence domain. RNA binding proteins are believed to play diverse
roles in modulating post-transcriptional gene expression.

An "immune response" includes, but is not limited to, one or more of the
following responses in a mammal: induction of antibodies, B cells, T cells (including
helper T cells, suppressor T cells, cytotoxic T cells, γδ T cells) directed specifically to
the antigen(s) in a composition or vaccine, following administration of the composition
or vaccine. An immune response to a composition or vaccine thus generally includes
the development in the host mammal of a cellular and/or antibody-mediated response to
the composition or vaccine of interest. In general, the immune response will result in
prevention or reduction of infection by a SARS virus.

An "immunogenic fragment" of a polypeptide or nucleic acid molecule refers to
an amino acid or nucleotide sequence that elicits an immune response. Thus, an
immunogenic fragment may include, without limitation, any portion of any of the
SARS virus sequences described herein, or a sequence substantially identical thereto,
that includes one or more epitopes (the antigenic determinant, i.e., the site recognized
by a specific immune system cell, such as a T cell or a B cell). An "epitope" may include
amino acids in a spatial orientation such that they are non-contiguous in the amino acid
sequence but are near each other due to the three dimensional conformation of the
polypeptide. An epitope may include at least 3, 5, 8, or 10 or more amino acids.
Immunogenic fragments or epitopes may be identified using standard methods known
to those of skill in the art, such as epitope mapping techniques or antigenicity or
hydropathy plots using, for example, the Omiga version 1.0 program from Oxford
Molecular Group (see, for example, U.S. Patent No. 4,708,871). Immunogenic
fragments or epitopes may also be identified using methods for determining three
dimensional molecule structure such as X-ray crystallography or nuclear magnetic
resonance.

A "sample" may be a tissue biopsy, amniotic fluid, cell, blood, serum, plasma,
urine, stool, sputum, conjunctiva, or any other specimen, or any extract thereof,
obtained from a patient (human or animal), test subject, or experimental animal. A
"sample" may also be a cell or cell line created under experimental conditions, and
constituents thereof (such as cell culture supernatants, cell fractions, infected cells,
etc.). The sample may be analyzed to detect the presence of a SARS virus gene,
genome, polypeptide, nucleic acid molecule or virion, or to detect a mutation in a
SARS virus gene, expression levels of a SARS virus gene or polypeptide, or the
biological function of a SARS virus polypeptide, using methods that are known in the
art. For example, methods such as sequencing, single-strand conformational
polymorphism (SSCP) analysis, or restriction fragment length polymorphism (RFLP)
analysis of PCR products derived from a sample can be used to detect a mutation in a
SARS virus gene; ELISA or western blotting can be used to measure levels of SARS
virus polypeptide or antibody affinity; northern blotting can be used to measure SARS
mRNA levels; or PCR can be used to measure the level of a SARS virus nucleic acid
molecule.

Other features and advantages of the invention will be apparent from the 
following description of the drawings and the invention, and from the claims. 

Brief Description of the Drawings 
Figures 1A-D show phylogenetic analyses of SARS proteins. Unrooted
phylogenetic trees were generated by clustalw (Thompson, J. D. et al., Nucleic Acids
Res 22, 4673-80, Nov 11, 1994) bootstrap analysis using 1000 iterations. Genbank
accessions for protein sequences are as follows: Figure 1A: Replicase 1A: BoCov
(Bovine Coronavirus): AAL40396, 229E (Human Coronavirus): NP_07355, MHV
(Mouse Hepatitis Virus): NP_045298, AIBV (Avian Infectious bronchitis
virus): CAC39113, TGEV (Transmissible Gastroenteritis Virus): NP_058423. Figure
1B: Matrix Glycoprotein: PHEV (Porcine hemagglutinating encephalomyelitis
virus): AAL80035, BoCov (Bovine Coronavirus): NP_150082, AIBV & AIBV2 (Avian
infectious bronchitis virus): AAF35863 & AAK83027, MHV (Mouse hepatitis
virus): AAF36439, TGEV (Transmissible gastroenteritis virus): NP_058427, 229E &
OC43 (Human Coronavirus): NP_073555 & AAA45462, FCV (Feline
coronavirus): BAC01160. Figure 1C: Nucleocapsid: MHV (Mouse hepatitis
virus): P18446, BoCov (Bovine coronavirus): NP_150083, AIBV (Avian infectious
bronchitis virus): AAK27162, FCV (Feline coronavirus): CAA74230, PTGV (Porcine
transmissible gastroenteritis virus): AAM97563, 229E & OC43 (Human
coronavirus): NP_073556 & P33469, PHEV (porcine hemagglutinating
encephalomyelitis virus): AAL80036, TCV (Turkey coronavirus): AAF23873. Figure
1D: S (Spike) Protein: BoCov (Bovine coronavirus): AAL40400, MHV (Mouse
hepatitis virus): P11225, OC43 & 229E (Human coronavirus): S44241 & AAK32191,
PHEV (Porcine hemagglutinating encephalomyelitis virus): AAL80031, PRC (Porcine
respiratory coronavirus): AAA46905, PEDV (Porcine epidemic diarrhea
virus): CAA80971, CCov (Canine coronavirus): S41453, FICV (Feline infectious
peritonitis virus): BAA06805, AIBV (Avian infectious bronchitis virus): AAO34396.

Figure 2 shows a schematic representation of the ORFs and s2m motif in the
29,736-base SARS virus genome.

Figures 3A-P show nucleotide sequences of the 29,736-base genome of the
SARS virus (SEQ ID NOs: 1 and 2).

Figure 4 shows an alignment of the s2m regions from Avian infectious
bronchitis virus (AIBV; SEQ ID NO: 32) and equine rhinovirus serotype 2 (ERV-2;
SEQ ID NO: 31) with the 3' untranslated region (UTR; SEQ ID NO: 18) of the SARS
virus (TOR2). The conserved areas in the s2m region are indicated by asterisks.

Figure 5 shows the amino acid sequence of the SARS virus S (Spike)
Glycoprotein (SEQ ID NO: 33).

Figure 6 shows the amino acid sequence of the SARS virus M (Matrix)
Glycoprotein (SEQ ID NO: 34).

Figure 7 shows the amino acid sequence of the SARS virus E (Small envelope)
protein (SEQ ID NO: 35).

Figure 8 shows the amino acid sequence of the SARS virus N (Nucleocapsid)
Protein (SEQ ID NO: 36).

Figure 9 shows an alignment of the matrix glycoprotein M from the SARS
virus (Tor2_M or ORF5; SEQ ID NO: 34) and various other matrix glycoproteins (SEQ
ID NOs: 37-43). Asterisks (*) indicate percentage identity to the SARS matrix protein
as calculated by Align (Myers and Miller, CABIOS (1989) 4:11-17).

Figures 10A-B show an alignment of the nucleocapsid protein N from the
SARS virus (Tor2_N; SEQ ID NO: 36) and various other nucleocapsid proteins (SEQ
ID NOs: 44-52). Asterisks (*) indicate percentage identity to the SARS nucleocapsid
protein calculated by Align (Myers and Miller, CABIOS (1989) 4:11-17).

Figures 11A-K show the nucleotide sequence of the 29,751-base genome of the
SARS virus (SEQ ID NO: 15).

Figure 12 shows a schematic representation of the ORFs and s2m motif in the
29,751-base SARS virus genome.

Figures 13A-D show phylogenetic analyses of SARS proteins. Unrooted
phylogenetic trees were generated by clustalw 1.74 (J. D. Thompson, D. G. Higgins, T.
J. Gibson, Nucleic Acids Res 22, 4673-80 (Nov 11, 1994)) using the BLOSUM
comparison matrix and a bootstrap analysis of 1000 iterations. Numbers indicate
bootstrap replicates supporting each node. Phylogenetic trees were drawn with the
Phylip Drawtree program 3.6a3 (Felsenstein, J. 1993. PHYLIP (Phylogeny Inference
Package) version 3.5c. Distributed by the author. Department of Genetics, University of
Washington, Seattle). Branch lengths indicate the number of substitutions per residue.
Genbank accessions for protein sequences: A: Replicase 1A: BoCoV (Bovine
Coronavirus): AAL40396, HCoV-229E (Human Coronavirus): NP_07355, MHV
(Mouse Hepatitis Virus): NP_045298, IBV (Avian Infectious bronchitis
virus): CAC39113, TGEV (Transmissible Gastroenteritis Virus): NP_058423. B:
Membrane Glycoprotein: PHEV (Porcine hemagglutinating encephalomyelitis
virus): AAL80035, BoCoV (Bovine Coronavirus): NP_150082, IBV & IBV2 (Avian
infectious bronchitis virus): AAF35863 & AAK83027, MHV (Mouse hepatitis
virus): AAF36439, TGEV (Transmissible gastroenteritis virus): NP_058427,
HCoV-229E & HCoV-OC43 (Human Coronavirus): NP_073555 & AAA45462, FCoV (Feline
coronavirus): BAC01160. C: Nucleocapsid: MHV (Mouse hepatitis virus): P18446,
BoCoV (Bovine coronavirus): NP_150083, IBV 1 & 2 (Avian infectious bronchitis
virus): AAK27162 & NP_040838, FCoV (Feline coronavirus): CAA74230, PTGV
(Porcine transmissible gastroenteritis virus): AAM97563, HCoV-229E & HCoV-OC43
(Human coronavirus): NP_073556 & P33469, PHEV (porcine hemagglutinating
encephalomyelitis virus): AAL80036, TCV (Turkey coronavirus): AAF23873. D: S
(Spike) Protein: BoCoV (Bovine coronavirus): AAL40400, MHV (Mouse hepatitis
virus): P11225, HCoV-OC43 & HCoV-229E (Human coronavirus): S44241 &
AAK32191, PHEV (Porcine hemagglutinating encephalomyelitis virus): AAL80031,
PRCoV (Porcine respiratory coronavirus): AAA46905, PEDV (Porcine epidemic
diarrhea virus): CAA80971, CCoV (Canine coronavirus): S41453, FIPV (Feline
infectious peritonitis virus): BAA06805, IBV (Avian infectious bronchitis
virus): AAO34396.

Figures 14A-F show an alignment of the spike glycoprotein S from the SARS
virus (Tor2_S; SEQ ID NO: 33) and various other spike glycoproteins (SEQ ID NOs:
53-62). Asterisks (*) indicate percentage identity to the SARS spike protein as
calculated by Align (Myers and Miller, CABIOS (1989) 4:11-17).

Figure 15 shows an alignment between the SARS virus Small envelope protein
E (TOR2_E; SEQ ID NO: 35) and the Envelope protein (Protein 4) (XI protein) (ORF
3) from Porcine transmissible gastroenteritis coronavirus (strain Purdue), Swissprot
accession number P09048 (PGV; SEQ ID NO: 63), as calculated by FASTA
(http://www.ebi.ac.uk/fasta33/).

Figures 16A-B show the amino acid sequence of the SARS virus Replicase 1A
protein (SEQ ID NO: 64).

Figure 17 shows the amino acid sequence of the SARS virus Replicase 1B
protein (SEQ ID NO: 65).

Figure 18 shows the amino acid sequence of ORF3 of SARS virus (SEQ ID 
NO: 66). 

Figure 19 shows the amino acid sequence of ORF4 of SARS virus (SEQ ID 
NO: 67). 

Figure 20 shows the amino acid sequence (SEQ ID NO: 68) of ORF6
(nucleotides 27059-27247 of the 29,736-base genome sequence) or ORF7 (nucleotides
27,074-27,265 of the 29,751-base genome sequence) of SARS virus.

Figure 21 shows the amino acid sequence (SEQ ID NO: 69) of ORF7
(nucleotides 27258-27623 of the 29,736-base genome sequence) or ORF8 (nucleotides
27,273-27,641 of the 29,751-base genome sequence) of SARS virus.

Figure 22 shows the amino acid sequence (SEQ ID NO: 70) of ORF8
(nucleotides 27623-27754 of the 29,736-base genome sequence) or ORF9
(nucleotides 27,638-27,772 of the 29,751-base genome sequence) of SARS virus.

Figure 23 shows the amino acid sequence (SEQ ID NO: 71) of ORF9
(nucleotides 27764-27880 of the 29,736-base genome sequence) or ORF10 (nucleotides
27,779-27,898 of the 29,751-base genome sequence) of SARS virus.

Figure 24 shows the amino acid sequence (SEQ ID NO: 72) of ORF10
(nucleotides 27849-28100 of the 29,736-base genome sequence) or ORF11 (nucleotides
27,864-28,118 of the 29,751-base genome sequence) of SARS virus.

Figure 25 shows the amino acid sequence of ORF13 of SARS virus (SEQ ID
NO: 73).

Figure 26 shows the amino acid sequence of ORF14 of SARS virus (SEQ ID
NO: 74).

Figure 27 shows an alignment of the secreted region of the SARS virus ORF10
of the 29,751-base genome sequence (sars) with the conotoxin from Conus ventricosus
(conotoxin). Sequence identity is indicated by asterisks and sequence homology is
indicated by dots.



Detailed Description of the Invention 
In general, the invention provides nucleic acid molecules, polypeptides, and
other reagents derived from a SARS virus, as well as methods of using such nucleic
acid molecules, polypeptides, and other reagents.

The genome sequence (Figures 3A-P, 11A-K; SEQ ID NOs: 1, 2, and 15)
reveals that the SARS coronavirus is only moderately related to other known
coronaviruses, including two human coronaviruses, OC43 and 229E. Thus, the SARS
virus is a previously unknown virus. The 5' end of the SARS genome contains a 5'
leader sequence (Table 1; SEQ ID NO: 3) with sequence similarity to the highly
conserved coronavirus core leader sequence, 5'-CUAAAC-3' (SEQ ID NO: 75;
Sawicki, S. G., et al., Adv Exp Med Biol 440, 215-9, 1998; Lai, M. M. and D.
Cavanagh, Adv Virus Res 48, 1-100, 1997). Transcriptional regulatory sequences
(TRSs) were identified upstream of all open reading frames (ORFs) (Tables 1 and 2;
SEQ ID NOs: 3-13 and 20-30). ORF9 and ORF10 of the 29,736-base SARS genome
(ORF10 and ORF11 of the 29,751-base genome) overlap by 12 amino acids, and have
matches to the TRS consensus in close proximity to their respective initiating
methionine codons.
The 3' UTR sequence (SEQ ID NO: 18) of SARS virus contains an s2m region
(having the sequence
ACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAAT; SEQ ID NO: 16) that includes
a conserved, discontinuous 32 base-pair s2m motif. The conserved 32 base-pair motif is
a universal feature of astroviruses that has also been identified in avian coronavirus
(AIBV) and the ERV-2 equine rhinovirus. This motif has been identified by Jonassen
C.M. et al. (J Gen Virol 1998 Apr;79 (Pt 4):715-8) as
GCCGNGGCCACGC(G/C)GAGTA(C/G)GANCGAGGGTACAG(G/C) (SEQ ID NO: 19), where N is
generally not part of the conserved motif, and can be any nucleotide. The region
corresponding to the 32 base-pair motif in SARS virus includes the sequence
CGAGGCCACGCGGAGTACGATCGAGGGTACAG (SEQ ID NO: 17), and spans
positions 29590-29621 of the 29,751-base genome. Figure 4 shows an alignment of the
s2m regions from Avian infectious bronchitis virus (AIBV; SEQ ID NO: 32) and
equine rhinovirus serotype 2 (ERV-2; SEQ ID NO: 31), as defined in Jonassen C.M. et
al. (J Gen Virol 1998 Apr;79 (Pt 4):715-8), with the entire 3' untranslated region
(UTR) of the SARS virus (TOR2) (SEQ ID NO: 18).
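The relationship between the published s2m consensus (SEQ ID NO: 19) and the SARS virus 32-base region (SEQ ID NO: 17) can be examined with a short Python sketch. The helper names are hypothetical. The consensus is converted position by position into regular-expression classes, with N read as any nucleotide and (G/C) as either base, and the 32-base SARS sequence is then slid along it.

```python
import re

S2M_REGION = "ACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGTGAAT"  # SEQ ID NO: 16
SARS_S2M_32 = "CGAGGCCACGCGGAGTACGATCGAGGGTACAG"                # SEQ ID NO: 17
CONSENSUS = "GCCGNGGCCACGC(G/C)GAGTA(C/G)GANCGAGGGTACAG(G/C)"   # SEQ ID NO: 19


def consensus_positions(consensus):
    """Turn the consensus into one regex fragment per sequence position."""
    fragments = []
    for token in re.findall(r"\([ACGT]/[ACGT]\)|[ACGTN]", consensus):
        if token == "N":
            fragments.append(".")                      # any nucleotide
        elif token.startswith("("):
            fragments.append("[" + token[1] + token[3] + "]")
        else:
            fragments.append(token)
    return fragments


def matching_offset(positions, query):
    """Return the first consensus offset at which `query` matches, or None."""
    for offset in range(len(positions) - len(query) + 1):
        window = "".join(positions[offset:offset + len(query)])
        if re.fullmatch(window, query):
            return offset
    return None


positions = consensus_positions(CONSENSUS)
offset = matching_offset(positions, SARS_S2M_32)
start_in_region = S2M_REGION.find(SARS_S2M_32) + 1  # 1-based
full_consensus_hit = re.search("".join(positions), S2M_REGION)
```

Under these assumptions the 32-base SARS sequence matches the consensus starting at its third position, while the full consensus, including its first two and last positions, is not found in SEQ ID NO: 16.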
Table 1. Listing of the transcription regulatory sequences of the 29,736-base SARS genome,
showing the nucleotide position (base) and associated open-reading frames (ORF). An asterisk
(*) indicates consensus sequence.



Base     ORF       TRS Sequence                          SEQ ID NO
45       Leader    TCTCTAAACGAACTTTAAAATCTGTG            3
21464    S         CAACTAAACGAACATG                      4
25238    ORF3      CACATAAACGAACTTATG                    5
26089    E         TGAGTACGAACTTATG                      6
26326    M         GGTCTAAACGAACTAACT 40 ATG             7
26986    ORF6      AACTATAAATT 62 ATG                    8
27244    ORF7      TCCATAAAACGAACATG                     9
27575    ORF8      TGCTCTA GTATTTTTAATACTTTG 24 ATG      10
27751    ORF9      AGTCTAAACGAACATG                      11
27837    ORF10     CTAATAAACCTCATG                       12
28084    N         TAAATAAACGAACAAATTAAAATG              13
                   * * * * * * * *
Table 2. Listing of the transcription regulatory sequences of the 29,751-base SARS genome,
showing the nucleotide position (base), associated open-reading frames (ORF), and identified
transcription regulatory sequences. Numbers in parentheses within the alignment indicate
distance to the putative initiating codon. The conserved core sequence is indicated in bold in
the putative leader sequence. Contiguous sequences identical to region of the leader sequence
containing the core sequence are shaded. No putative TRSs were detected for ORFs 4, 13 and
14, although ORF 13 may share the TRS associated with the N protein.





Base     ORF            TRS Sequence                           SEQ ID NO
60       Leader         UCUCUAAACGAACUUUAAAAUCUGUG             20
21479    S (Spike)      CAACUAAACGAACAUG                       21
25252    ORF3           CACAUAAACGAACUUAUG                     22
26104    Envelope       UGAGUACGAACUUAUG                       23
26341    M              GGUCUAAACGAACUAACU (40) AUG            24
27001    ORF7           AACUAUAAAUU (62) AUG                   25
27259    ORF8           UCCAUAAAACGAACAUG                      26
27590    ORF9           UGCUCUA GUAUUUUUAAUACUUUG (24) AUG     27
27766    ORF10          AGUCUAAACGAACAUG                       28
27852    ORF11          CUAAUAAACCUCAUG                        29
28099    Nucleocapsid   UAAAUAAACGAACAAAUUAAAAUG               30



The coding potentials of the 29,736-base and 29,751-base genomes are depicted
in Figures 2 and 12, respectively. Open reading frames (ORFs) include the Replicase
1a and 1b translation products, the Spike glycoprotein, the small Envelope protein, the
Membrane and the Nucleocapsid protein. Construction of unrooted phylogenetic trees
using this set of known proteins from representatives of the three known coronaviral
groups reveals that the proteins encoded by the SARS virus do not readily cluster more
closely with any known group than with any other (Figures 1A-D and 13A-D). In
addition, nine novel ORFs have been analyzed.

The Replicase 1a ORF, located at nucleotides 250-13395 of the 29,736-base
genome and nucleotides 265-13,398 of the 29,751-base genome, and the Replicase 1b ORF,
located at nucleotides 13395-21467 of the 29,736-base genome and nucleotides
13,398-21,485 of the 29,751-base genome, occupy 21.2 kb of the SARS virus genome
(Figures 2 and 12). These genes encode a number of proteins that are produced by
proteolytic cleavage of a large polyprotein (Ziebuhr, J. et al., J Gen Virol 81, 853-79,
Apr, 2000). A frame shift mutation interrupts the protein-coding region, separating the
1a and 1b open-reading frames. The proteins encoded by the Replicase 1a and 1b
ORFs are depicted in Figures 16A-B and 17 (SEQ ID NOs: 64 and 65).

The Spike glycoprotein (S) (E2 glycoprotein gene; Figures 2 and 12;
nucleotides 21477 to 25241 of the 29,736-base genome, and nucleotides 21,492 to
25,259 of the 29,751-base genome) encodes a surface projection glycoprotein precursor
of about 1,255 amino acids in length (Figure 5; SEQ ID NO: 33), which may be
significant in the virulence of the SARS virus. Mutations in this gene are correlated
with altered pathogenesis and virulence in other coronaviruses (B. N. Fields et al.,
Fields virology (Lippincott Williams & Wilkins, Philadelphia, 4th ed., 2001)). In other
coronaviruses, the mature spike protein is inserted in the viral envelope with the
majority of the protein exposed on the surface of the particles. Three molecules of the
Spike protein form the characteristic peplomers or corona-like structures of this virus
family. Analysis of the spike glycoprotein with SignalP (Nielsen, H. et al., Prot
Engineer 10:1-6 (1997)) indicates a signal peptide (MFIFLLFLTLTSG; SEQ ID NO:
76) (probability 0.996) with cleavage between residues 13 and 14. TMHMM
(Sonnhammer, E. L. et al., Proc Int Conf Intell Syst Mol Biol 6, 175-82 (1998))
indicates a transmembrane domain near the C-terminal end
(WYVWLGFIAGLIAIVMVTILLCC; SEQ ID NO: 183). Together these data indicate
a type I membrane protein with the N-terminus and the majority of the protein (residues
14-1195) on the outside of the cell surface or virus particle, which may be responsible
for binding to a cellular receptor. The SARS virus Spike glycoprotein has limited
sequence identity to other known Spike glycoproteins (Figures 14A-F).

ORF 3 (Figures 2 and 12; nucleotides 25253-26074 of the 29,736-base genome
and nucleotides 25,268-26,092 of the 29,751-base genome) encodes a protein of 274
amino acids (Figure 18; SEQ ID NO: 66) that lacks significant similarities to any
known protein when analyzed with BLAST (Altschul, S. F. et al., Nucleic Acids Res 25,
3389-402, Sep 1, 1997), FASTA (Pearson, W. R. and D. J. Lipman, Proc Natl Acad Sci
USA 85, 2444-8, Apr, 1988) or PFAM (Bateman, A. et al., Nucleic Acids Res 30,
276-80, Jan 1, 2002). Analysis of the N-terminal 70 amino acids with SignalP indicates the
existence of a signal peptide (MDLFMRFFTLRSITAQ; SEQ ID NO: 184) and a
cleavage site (probability 0.540). Both TMpred (Hofmann, K. and W. Stoffel, Biol
Chem. Hoppe-Seyler 374, 166 (1993)) and TMHMM indicate three transmembrane
regions spanning approximately residues 34-56 (TIPLQASLPFGWLVIGVAFLAVF,
SEQ ID NO: 77), 77-99 (FQFICNLLLLFVTIYSHLLLVAA, SEQ ID NO: 78), and
103-125 (AQFLYLYALIYFLQCINACRIIM, SEQ ID NO: 79). Both TMpred and
TMHMM indicate that the C-terminus and a large 149 amino acid domain is located
inside the viral or cellular membrane. The C-terminal (interior) region of the protein,
corresponding to about amino acids 124-274
(MRCWLCWKCKSKNPLLYDANYFVCWHTHNYDYCIPYNSVTDTIVVTEGDGI
STPKLKEDYQIGGYSEDRHSGVKDYVVVHGYFTEVYYQLESTQITTDTGIENAT
FFIFNKLVKDPPNVQIHTIDGSSGVANPAMDPIYDEPTTTTSVPL; SEQ ID NO:
185), may encode a protein domain with ATP-binding properties (PD037277).

ORF 4 (Figure 12; nucleotides 25,689-26,153 of the 29,751-base genome)
encodes a predicted protein of 154 amino acids (Figure 19; SEQ ID NO: 67). This
ORF overlaps entirely with ORF 3 and the E protein. ORF4 may be expressed from the
ORF mRNA using an internal ribosomal entry site. BLAST analyses failed to identify
matching sequences. Analysis with TMpred predicts a single transmembrane helix at
amino acids 1-20 (MMPTTLFAGTHITMTTVYHI; SEQ ID NO: 186).

The gene for the small envelope protein E (Figures 2 and 12; nucleotides 26102-26329
of the 29,736-base genome and nucleotides 26,117-26,347, ORF 5, of the 29,751-base
genome) encodes a protein of 76 amino acids (Figure 7; SEQ ID NO: 35). BLAST and
FASTA comparisons indicate that the protein, while novel, is homologous to multiple
envelope proteins (alternatively known as small membrane proteins) from several
coronaviruses. An alignment of the SARS virus E protein with the envelope protein of
Porcine transmissible gastroenteritis coronavirus indicates approximately 28%
sequence identity between the two proteins over a 61 amino acid overlap, as calculated
by FASTA (Figure 15). PFAM analysis of the protein indicates that the small envelope
protein E is a member of the NS3_EnvE protein family. InterProScan (R. Apweiler et
al., Nucleic Acids Res 29, 37-40, Jan 1, 2001; Zdobnov, E. M. and R. Apweiler,
Bioinformatics 17, 847-8, Sep, 2001) analysis indicates that the protein is a component
of the viral envelope, and homologs of it are found in other viruses, including
gastroenteritis virus and murine hepatitis virus. SignalP analysis indicates the presence
of a transmembrane anchor (probability 0.939). TMpred analysis indicates a similar
transmembrane anchor at positions 17-34 (VLLFLAFWFLLVTLAIL, SEQ ID NO:
80), which is consistent with the known association of homologous proteins with the
viral envelope. TMHMM indicates a type II membrane protein with the majority of the
46 residue C-terminal hydrophilic domain
(TALRLCAYCCNIVNVSLVKPTVYWSRWNLNSSEGVPDLLV; SEQ ID NO:
187) located on the surface of the viral particle. The E protein may be important for
viral replication.
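
The contrast between the membrane anchor and the hydrophilic C-terminal domain of the E protein can be illustrated with a simple hydropathy average. This is only a rough sketch: it uses the Kyte-Doolittle scale, which is a swapped-in simplification and not the algorithm used by TMpred, TMHMM or SignalP, and the function name mean_hydropathy is an assumption made for illustration. The two peptide strings are the ones quoted above.

```python
# Kyte-Doolittle hydropathy scale; higher values are more hydrophobic.
# Assumption: this scale stands in for the prediction tools named in the text.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle score over all residues of a peptide."""
    return sum(KD[aa] for aa in peptide) / len(peptide)


# Segments quoted in the text for the E protein.
E_TM_ANCHOR = "VLLFLAFWFLLVTLAIL"  # SEQ ID NO: 80
E_C_TERMINAL = "TALRLCAYCCNIVNVSLVKPTVYWSRWNLNSSEGVPDLLV"  # SEQ ID NO: 187

if __name__ == "__main__":
    print(f"TM anchor mean hydropathy:   {mean_hydropathy(E_TM_ANCHOR):.2f}")
    print(f"C-terminal mean hydropathy:  {mean_hydropathy(E_C_TERMINAL):.2f}")
```

A markedly higher average for the anchor segment than for the C-terminal segment is consistent with, but does not replace, the predictions reported above.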

The gene for the Matrix glycoprotein M (Figures 2 and 12; nucleotides 26383-27045 of
the 29,736-base genome and nucleotides 26,398-27,063, ORF 6, of the 29,751-base
genome) encodes a protein of 221 amino acids (Figure 6; SEQ ID NO: 34). BLAST and
FASTA analysis indicates that the protein, while novel, is homologous to coronaviral
matrix glycoproteins (Figure 9). The association of the spike glycoprotein (S) with the
matrix glycoprotein (M) may be an essential step in the formation of the viral envelope
and in the accumulation of both proteins at the site of virus assembly. Analysis of the
amino acid sequence with SignalP indicates a signal sequence (probability 0.932),
located at approximately residues 1-39
(MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYS; SEQ ID NO: 188)
that is unlikely to be cleaved. TMHMM and TMpred analysis both indicate the
presence of three trans-membrane helices, located at approximately residues 15-37
(LLEQWNLVIGFLFLAWIMLLQFA; SEQ ID NO: 81), 50-72
(LVFLWLLWPVTLACFVLAAVYRI; SEQ ID NO: 82) and 77-99
(GGIAIAMACIVGLMWLSYFVASF; SEQ ID NO: 83), with the 121 amino acid
hydrophilic domain on the inside of the virus particle, where it may interact with
nucleocapsid. The hydrophilic domain may comprise approximately amino acids
PLRGTIVTRPLMESELVIGAVIIRGHLRMAGHSLGRCDIKDLPKEITVATSRTLS
YYKLGASQRVGTDSGFAAYNRYRIGNYKLNTDHAGSNDNIALLVQ (SEQ ID
NO: 189), i.e., approximately amino acids 95 or 99 to 221 of SEQ ID NO: 34. PFAM
analysis reveals a match to PFAM domain PF01635, and alignments to 85 other
sequences in the PFAM database bearing this domain, which is indicative of the
coronavirus matrix glycoprotein.
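
Residue ranges in this description are 1-based and inclusive, so a range a-b covers b - a + 1 residues. As a small worked check, the sketch below extracts residues 15-37 from the N-terminal 39 residues quoted above for the M protein and compares the result with the first predicted helix. The helper name residues is hypothetical and is introduced only for illustration.

```python
def residues(seq: str, first: int, last: int) -> str:
    """Return residues first..last (1-based, inclusive) of a protein sequence."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError("residue range lies outside the sequence")
    # Python slices are 0-based and end-exclusive, hence first - 1 and last.
    return seq[first - 1:last]


# Sequences quoted in the text for the M protein.
M_N_TERMINUS = "MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYS"  # residues 1-39, SEQ ID NO: 188
M_HELIX_1 = "LLEQWNLVIGFLFLAWIMLLQFA"  # residues 15-37, SEQ ID NO: 81

helix_from_range = residues(M_N_TERMINUS, 15, 37)

if __name__ == "__main__":
    print(helix_from_range, len(helix_from_range))
```

The same helper can be applied to the other residue ranges given in this section when the full-length sequences from the Figures are available.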

ORF6 (Figure 2; nucleotides 27059-27247 of the 29,736-base genome
sequence) or ORF 7 (Figure 12; nucleotides 27,074-27,265 of the 29,751-base genome
sequence) encodes a protein of 63 amino acids (Figure 20; SEQ ID NO: 68). TMpred
analysis indicates a trans-membrane helix located between residues 3 or 4 and 22
(HLVDFQVTIAEILIIIMRTF; SEQ ID NO: 84), with the N-terminus located outside
the viral particle.

Similarly, the gene encoding ORF7 (Figure 2; nucleotides 27258-27623 of the
29,736-base genome sequence) or ORF 8 (Figure 12; nucleotides 27,273-27,641 of the
29,751-base genome sequence), encoding a protein of 122 amino acids (Figure 21; SEQ
ID NO: 69), has no significant BLAST or FASTA matches to known proteins.
Analysis of this sequence with SignalP indicates a cleaved signal sequence
(MKIILFLTLIVFTSC; SEQ ID NO: 85) (probability 0.995), with the cleavage site
located between residues 15 and 16. TMpred and TMHMM analyses also indicate a
trans-membrane helix located approximately at residues 99-117
(SPLFLIVAALVFLILCFTI; SEQ ID NO: 86). Together these data indicate that this
protein is a type I membrane protein with the major hydrophilic domain of the protein
(residues 16-98; ELYHYQECVRGTTVLLKEPCP
SGTYEGNSPFHPLADNKFALTCTSTHFAFACADGTRHTYQLRARSVSPKLFIRQ
EEVQQELY; SEQ ID NO: 87) and the amino-terminus oriented inside the lumen of
the ER/Golgi, or on the surface of the cell membrane or virus particle, depending on the
membrane localization of the protein.

ORF8 (Figure 2; nucleotides 27623-27754 of the 29,736-base genome
sequence) or ORF9 (Figure 12; nucleotides 27,638-27,772 of the 29,751-base genome
sequence) encodes a protein of 44 amino acids (Figure 22; SEQ ID NO: 70). FASTA
analysis of this sequence revealed some weak similarities (37% identity over a 35
amino acid overlap) to Swiss-Prot accession Q9M883, annotated as a putative sterol-C5
desaturase. A similarly weak match to a hypothetical Clostridium perfringens protein
(Swiss-Prot accession CPE2366) was also detected. TMpred indicated a single strong
trans-membrane helix (FYLCFLAFLLFLVLIMLIIFWFS; SEQ ID NO: 190), with little
preference for alternate models in which the N-terminus was located inside or outside
the particle.

Similarly, ORF9 (Figure 2; nucleotides 27764-27880 of the 29,736-base genome
sequence) or ORF10 (Figure 12; nucleotides 27,779-27,898 of the 29,751-base genome
sequence), encoding a protein of 39 amino acids (Figure 23; SEQ ID NO: 71), exhibited
no significant matches in BLAST and FASTA searches but encodes a trans-membrane
helix (LLIVLTCISLCSCICTWQ; SEQ ID NO: 191) as predicted by TMpred, with the
N-terminus located within the viral particle. The region immediately upstream of this
protein exhibits a strong match to the TRS consensus (Table 2), indicating that a
transcript initiates from this site. The large number of cysteine residues (6) may result
in cross-linking of the amino acids. Amino acids ICTWQRCASNKPHVLEDPCKVQH (SEQ
ID NO: 192) of this protein may be secreted. The secreted amino acids exhibit
homology to toxin proteins, for example, to the conotoxin of Conus ventricosus (Figure
27). Antigenic peptides from the hydrophilic (secreted) region, for example,
CICTWQRCASNKPHVLEDPCK (SEQ ID NO: 193), were used to generate
monoclonal antibodies using standard techniques. Furthermore, the C-terminal amino
acids form a sequence that shares homology to farnesylation sites (CKQH), which
generally require C-terminal location to be functional. This protein may act as a
virulence factor and/or may facilitate transmission to humans.

ORF10 (Figure 2; nucleotides 27849-28100 of the 29,736-base genome
sequence) or ORF11 (Figure 12; nucleotides 27,864-28,118 of the 29,751-base genome
sequence), encoding a protein of 84 amino acids (Figure 24; SEQ ID NO: 72), exhibited
only very short (9-10 residues) matches to a region of the human coronavirus E2
glycoprotein precursor (starting at residue 801). Analyses by SignalP and TMHMM
predict a soluble protein. A detectable alignment to the TRS consensus sequence was
also found (Table 2).

The protein (422 amino acids; Figure 8; SEQ ID NO: 36) encoded by the
Nucleocapsid gene (Figure 2; nucleotides 28105-29370 of the 29,736-base genome
sequence; Figure 12, nucleotides 28,120-29,388 of the 29,751-base genome sequence)
aligns well with nucleocapsid proteins from other representative coronaviruses (Figures
10A-B), although a short lysine-rich region (KTFPPTEPKKDKKKKTDEAQ; SEQ ID
NO: 14) is unique to SARS. This region is suggestive of a nuclear localization signal.
Since some coronaviruses are able to replicate in enucleated cells, the SARS virus
nucleocapsid protein may have evolved a novel nuclear function, which may play a role
in pathogenesis. In addition, the basic nature of this peptide suggests it may assist in
RNA binding. The SARS nucleocapsid protein is also a good candidate for diagnostic
tests.

ORF 13 (Fig. 12; nucleotides 28,130-28,426 of the 29,751-base genome
sequence) encodes a novel protein of 98 amino acids (Figure 25; SEQ ID NO: 73).
ORF 14 (Fig. 12; nucleotides 28,583-28,795 of the 29,751-base genome sequence)
encodes a novel protein of 70 amino acids (Figure 26; SEQ ID NO: 74). TMpred
predicts a single transmembrane helix VVAVIQEIQLLAAVGEILLLEW (SEQ ID NO:
194).

Various features of the SARS virus genome are summarised in Table 3. While 
Table 3 refers to the 29,751-base genome sequence, the features are also applicable to 
the 29,736-base genome sequence (SEQ ID NOs: 1 and 2). 

Table 3. Features of the SARS virus 29,751-base genome sequence.

Feature      Start - End^1      No. amino acids   No. bases   Frame   TRS
Orf 1a       265 - 13,398       4,382             13,149      +1      N/A
Orf 1b       13,398 - 21,485    2,628             7,887       +3      N/A
S protein    21,492 - 25,259    1,255             3,768       +3      Strong
Orf 3        25,268 - 26,092    274               825         +2      Strong
Orf 4        25,689 - 26,153    154               465         +3      Absent^2
E protein    26,117 - 26,347    76                231         +2      Weak
M protein    26,398 - 27,063    221               666         +1      Strong
Orf 7        27,074 - 27,265    63                192         +2      Weak
Orf 8        27,273 - 27,641    122               369         +3      Strong
Orf 9        27,638 - 27,772    44                135         +2      Weak
Orf 10       27,779 - 27,898    39                120         +2      Strong
Orf 11       27,864 - 28,118    84                255         +3      Weak
N protein    28,120 - 29,388    422               1,269       +1      Strong
Orf 13^2     28,130 - 28,426    98                297         +2      Absent^2
Orf 14^2     28,583 - 28,795    70                213         +2      Absent
s2m motif    29,590 - 29,621    N/A               30          N/A     N/A

1. End coordinates include the stop codon, except for ORF 1a and s2m.
2. These ORFs overlap substantially or completely with other ORFs and may share TRSs.
N/A indicates not applicable.
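
The coordinates, lengths and frames in Table 3 are related by simple arithmetic: a span that includes the stop codon has end - start + 1 bases, which equals three times the number of amino acids plus one stop codon, and the frame follows from the start coordinate. The following sketch checks these relations for the rows from the S protein to Orf 14. It is an illustrative consistency check under stated assumptions, not part of the described methods: the names TABLE_3 and check_row are hypothetical, and Orf 1a, Orf 1b and the s2m motif are omitted because footnote 1 exempts Orf 1a and s2m and the tabulated Orf 1b length does not equal its coordinate span.

```python
# (feature, start, end, amino acids, bases, frame) copied from Table 3.
# Orf 1a, Orf 1b and s2m are deliberately left out (see the lead-in).
TABLE_3 = [
    ("S protein", 21492, 25259, 1255, 3768, 3),
    ("Orf 3", 25268, 26092, 274, 825, 2),
    ("Orf 4", 25689, 26153, 154, 465, 3),
    ("E protein", 26117, 26347, 76, 231, 2),
    ("M protein", 26398, 27063, 221, 666, 1),
    ("Orf 7", 27074, 27265, 63, 192, 2),
    ("Orf 8", 27273, 27641, 122, 369, 3),
    ("Orf 9", 27638, 27772, 44, 135, 2),
    ("Orf 10", 27779, 27898, 39, 120, 2),
    ("Orf 11", 27864, 28118, 84, 255, 3),
    ("N protein", 28120, 29388, 422, 1269, 1),
    ("Orf 13", 28130, 28426, 98, 297, 2),
    ("Orf 14", 28583, 28795, 70, 213, 2),
]


def check_row(start: int, end: int, n_aa: int, n_bases: int, frame: int) -> bool:
    """True if span, codon count and reading frame agree for one table row."""
    span_ok = end - start + 1 == n_bases      # inclusive coordinates
    codons_ok = n_bases == 3 * (n_aa + 1)     # coding codons plus one stop codon
    frame_ok = (start - 1) % 3 + 1 == frame   # frame +1 starts at base 1, 4, 7, ...
    return span_ok and codons_ok and frame_ok


inconsistent = [name for name, *row in TABLE_3 if not check_row(*row)]

if __name__ == "__main__":
    print("inconsistent rows:", inconsistent)
```

An empty list means every checked row is internally consistent under the three relations above.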

Various polymorphisms may exist in the SARS virus. In the SARS 29,736-base
genome sequences (SEQ ID NO: 1 or 2), for example, nucleotides 7904, 16607, 19168,
24857, or 26842 may be C or T; or nucleotides 19049, 23205, or 25283 may be G or A,
and in the SARS 29,751-base genome sequence (SEQ ID NO: 15), for example,
nucleotides 7919, 16622, 19183, 24872, or 26857 may be C or T; or nucleotides 19064,
23220, or 25298 may be G or A. In some embodiments, the nucleotide changes may
result in no change in the encoded amino acid, or in a conservative or non-conservative
change in the encoded amino acid. In some embodiments, a nucleotide change, as
described herein, at position 7904 or 7919, may result in an A to V amino acid
substitution in the Replicase 1A protein coding region; a change at position 19168 or
19183 may result in a V to A amino acid substitution in the Replicase 1B protein
coding region; a change at position 23205 or 23220 may result in an A to S amino acid
substitution (non-conservative change), affecting the Spike glycoprotein coding region;
a change at position 25283 or 25298 may result in an R to G amino acid substitution
(non-conservative change), affecting ORF3; or a change at position 26842 or 26857
may result in an S to P amino acid substitution (non-conservative change), affecting the
Nucleocapsid protein coding region, in the SARS 29,736-base (SEQ ID NO: 1 or 2)
and 29,751-base genome (SEQ ID NO: 15) sequences, respectively. In various
embodiments, a nucleotide or amino acid sequence including a particular
polymorphism may be selected, for example, for use in the methods of the invention, or
may be excluded, for example, from a particular use according to the invention.
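
The polymorphic positions quoted for the two genome sequences differ by a constant offset of 15 bases, which is also the difference between 29,751 and 29,736. The sketch below makes that relation explicit. The offset is inferred only from the paired positions listed above, so treating it as valid for other positions is an assumption, and the names OFFSET and to_29751 are hypothetical.

```python
# Assumption: the +15 offset is inferred from the paired positions quoted in the
# text; it is not stated to hold for every position in the genome.
OFFSET = 29751 - 29736

# Polymorphic positions as listed for each genome sequence, in the same order.
POSITIONS_29736 = [7904, 16607, 19168, 24857, 26842, 19049, 23205, 25283]
POSITIONS_29751 = [7919, 16622, 19183, 24872, 26857, 19064, 23220, 25298]


def to_29751(position_29736: int) -> int:
    """Map a quoted position in the 29,736-base sequence to the 29,751-base one."""
    return position_29736 + OFFSET


if __name__ == "__main__":
    for old in POSITIONS_29736:
        print(old, "->", to_29751(old))
```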

Various alternative embodiments of the invention are described below. These
embodiments include, without limitation, identification and use of SARS virus nucleic
acid and amino acid sequences for diagnostic or therapeutic uses.

Diagnosis of SARS virus-related disorders 

A SARS virus-related disorder is any disorder that is mediated by the SARS
virus, or by a nucleic acid molecule or polypeptide derived from the SARS virus.
Accordingly, SARS virus nucleic acid molecules and polypeptides may be used to
diagnose and identify a SARS virus-related disorder in a mammal, for example, a
human or a domestic, farm, wild, or experimental animal. In some embodiments,
SARS virus nucleic acid molecules and polypeptides may be used to screen such
animals, e.g., civet cats, for the presence of SARS virus. A SARS virus-related
disorder may be a hepatic, enteric, respiratory, or neurological disorder, and may be
accompanied by one or more symptoms or indications including, but not limited to,
fever, cough, shortness of breath, headache, low blood oxygen concentration, liver 
damage, or reduced lymphocyte numbers. Accordingly, samples for diagnosis may be 
obtained from cells, blood, serum, plasma, urine, stool, conjunctiva, sputum, 
nasopharyngeal or oropharyngeal swabs, tracheal aspirates, bronchoalveolar lavage,
pleural fluid, amniotic fluid, or any other specimen, or any extract thereof, or by tissue
biopsy of, for example, lungs or major organs, obtained from a patient (human or
animal), test subject, or experimental animal.

A SARS virus-related disorder may be diagnosed by amplifying a SARS
nucleic acid molecule or fragment thereof from a sample. Probes or primers for use in
amplification may be prepared using standard techniques. In some embodiments,
probes or primers are selected from regions of a SARS virus genome as described
herein that show limited sequence homology or identity (e.g., less than 10%, 20%,
30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% identity) to other viruses or
pathogens, or to host sequences.
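
As a minimal sketch of how a candidate probe or primer region might be screened against an identity threshold, the function below computes percent identity between two pre-aligned sequences of equal length, ignoring gapped columns. It assumes the alignment has already been produced by a separate tool; the function names, the 50% default threshold and the example sequences are hypothetical and are not SARS virus sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def is_candidate_region(aligned_a: str, aligned_b: str, max_identity: float = 50.0) -> bool:
    """True if the region's identity to the comparison sequence is below the threshold."""
    return percent_identity(aligned_a, aligned_b) < max_identity


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(percent_identity("ACGTACGT", "ACGTTCGA"))
    print(is_candidate_region("AAAAAAAA", "TTTTTTTT"))
```

In practice a region would be compared against many viral, pathogen and host sequences, and the threshold chosen from the values listed above.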

Nucleic acid sequences can be amplified as needed by methods known in the
art. For example, this can be accomplished by polymerase chain reaction ("PCR") of
DNA, or of RNA by reverse transcriptase-PCR ("RT-PCR") (see generally PCR
Technology: Principles and Applications for DNA Amplification (ed. H. A. Erlich,
Freeman Press, NY, N.Y., 1992); PCR Protocols: A Guide to Methods and
Applications (eds. Innis, et al., Academic Press, San Diego, Calif., 1990); Mattila et al.,
Nucleic Acids Res. 19, 4967 (1991); Eckert et al., PCR Methods and Applications 1, 17
(1991); PCR (eds. McPherson et al., IRL Press, Oxford); and U.S. Pat. No. 4,683,202
issued July 28, 1987 to Mullis). Variations of standard PCR techniques, such as for
example real time RT-PCR using internal as well as amplification primers, resulting in
increased sensitivity and speed, and reduction of risk of sample contamination (see for
example Higuchi, R., et al., "Kinetic PCR Analysis: Real-time Monitoring of DNA
Amplification Reactions," Bio/Technology, vol. 11, pp. 1026-1030 (1993); Heid et al.,
"Real Time Quantitative PCR", Genome Research, 1996, pp. 986-994; Gibson UE et
al., "A novel method for real time quantitative RT-PCR," Genome Res. 1996
Oct;6(10):995-1001), or the "TaqMan" approach to PCR, described by for example
Holland et al., Proc. Natl. Acad. Sci., 88: 7276-7280 (1991), may be performed.

Other suitable amplification and analytical methods include single base
primer extension (see for example U.S. Patent No. 6,004,744), mini-sequencing, ligase
chain reaction (LCR) (see for example Wu and Wallace, Genomics 4, 560 (1989);
Landegren et al., Science 241, 1077 (1988)), transcription amplification (Kwoh et al.,
Proc. Natl. Acad. Sci. USA 86, 1173 (1989)), self-sustained sequence replication
(Guatelli et al., Proc. Natl. Acad. Sci. USA, 87, 1874 (1990)) and nucleic acid based
sequence amplification (NASBA). The latter two amplification methods involve
isothermal reactions based on isothermal transcription, which produce both single
stranded RNA (ssRNA) and double stranded DNA (dsDNA) as the amplification
products in a ratio of about 30 or 100 to 1, respectively.

A SARS virus-related disorder may also be diagnosed using an antibody
directed against a SARS virus nucleic acid or amino acid sequence that specifically
binds a nucleic acid molecule or polypeptide. In an alternative embodiment, the
antibody may be directed against a SARS polypeptide, for example, the S polypeptide
or fragment thereof that is located on the surface of the SARS virion. Methods for
preparation of antibodies or for assaying antibody binding are well known in the art.

Serological diagnosis may include detection of antibodies against a SARS
virus polypeptide or nucleic acid molecule, e.g., the Nucleocapsid protein, produced in
response to infection using techniques such as indirect fluorescent antibody testing or
enzyme-linked immunosorbent assays (ELISA). A SARS virus-related disorder may
also be diagnosed by, for example, performing in situ probe hybridization studies on
tissue specimens.

In some aspects, diagnostic tests as described herein or known to those of skill
in the art may be performed for SARS virus variants that exhibit increased
pathogenicity, such as strains having redundant sequences.

In some embodiments, reagents for diagnosis (e.g., probes, primers, antibodies,
etc.) may be provided in kits which may optionally include instructions for using the
reagent or may include other reagents for performing the appropriate assay, e.g.,
controls, standards, buffers, etc.

Therapy or Prophylaxis for SARS virus-related disorders

Compounds according to the invention may also be used to provide therapeutics
or prophylactics for SARS virus-related disorders. Accordingly, such compounds may
be used to treat a mammal, for example, a human or a domestic, farm, wild, or
experimental animal that has or is at risk for a SARS virus-related disorder. Such
compounds may include, without limitation, compounds that interfere with SARS virus
replication, expression of SARS virus proteins, or the ability of the SARS virus to
infect a host cell. Accordingly, in some embodiments, compounds that act as
antagonists to SARS virus polypeptides may be used as therapeutics or prophylactics
for SARS virus-related disorders. In some embodiments, purified SARS virus
polypeptides may be used as, for example, competitive inhibitors to disrupt viral
function. For example, a Spike protein lacking a functional domain, or having some
other modification that maintains binding but reduces or eliminates pathogenicity, may
be used to disrupt viral function. In some embodiments, antibodies that bind SARS
virus polypeptides or nucleic acid molecules, for example, humanized antibodies, may
be used as therapeutics or prophylactics.

In some embodiments, the SARS-virus compounds may be used as vaccines, or 
may be used to develop vaccines. For example, peptides derived from portions of 
SARS-virus proteins or polypeptides located on the outside of the virion or cell surface 
may be useful for vaccines or for generation of therapeutic or prophylactic antibodies. 

A "vaccine" is a composition that includes materials that elicit a desired
immune response. A vaccine may select, activate or expand memory B and T cells of
the immune system to, for example, enable the elimination of infectious agents, such as
a SARS virus, or a component thereof. In some embodiments, a vaccine includes a
suitable carrier, such as an adjuvant, which is an agent that acts in a non-specific
manner to increase the immune response to a specific antigen, or to a group of antigens,
enabling the reduction of the quantity of antigen in any given vaccine dose, or the
reduction of the frequency of dosage required to generate the desired immune response.

Vaccines according to the invention may include SARS virus polypeptides and
nucleic acid molecules described herein, or immunogenic fragments thereof. In some
embodiments, a SARS virus Spike polypeptide, Envelope polypeptide, or membrane
glycoprotein or fragments thereof may be suitable for vaccine applications. In some
embodiments, the vaccines may be multivalent and include one or more epitopes from a
SARS virus polypeptide or fragment thereof.

In some embodiments of the invention, a vaccine may include a live or killed
microorganism, e.g., a SARS virus or a component thereof. If a live SARS virus is
used, which may be administered in the form of an oral vaccine, it may contain
non-revertible genetic alterations (for example, large deletions or insertions in the genomic
sequence) that reduce or eliminate the virulence of the virus ("attenuated virus"), but
not its induction of an immune response. In some embodiments, a live vaccine may
include an attenuated non-SARS microorganism (e.g., bacteria or virus such as vaccinia
virus) that is capable of expressing a SARS virus polypeptide or immunogenic
fragment thereof as described herein. In some embodiments, a vaccine may include
SARS virus polypeptides or nucleic acid molecules having modifications that facilitate
ease of administration. For example, an indigestible SARS virus polypeptide or nucleic
acid molecule may be used for oral administration, and a modification that is suitable
for inhalation may be used for administration to the lung.

A "nucleic acid vaccine" or "DNA vaccine" as used herein, is a nucleic acid
construct comprising a polynucleotide encoding a polypeptide antigen, particularly an
antigenic amino acid subsequence identified by methods described herein or known in
the art. The nucleic acid construct can also include transcriptional promoter elements,
enhancer elements, splicing signals, termination and polyadenylation signals, and other
nucleic acid sequences. Thus, a nucleic acid vaccine is generally introduced into a
subject animal using, for example, one or more DNA plasmids including one or more
antigen-coding sequences (for example, a SARS virus Envelope polypeptide or
membrane glycoprotein sequence) that are capable of transfecting cells in vivo and
inducing an immune response (see for example Whalen RG et al. DNA-mediated
immunization and the energetic immune response to hepatitis B surface antigen. Clin
Immunol Immunopathol 1995;75:1-12; Wolff JA et al. Direct gene transfer into mouse
muscle in vivo. Science 1990;247:1465-8; Fynan EF et al. DNA vaccines: protective
immunizations by parenteral, mucosal, and gene-gun inoculations. Proc Natl Acad Sci
USA 1993;90:11478-82). In some embodiments, a library of nucleic acid fragments
may be prepared by cloning SARS virus genomic DNA into a plasmid expression
vector using known techniques and the library then used as a nucleic acid vaccine (see
for example Barry MA, et al. Protection against mycoplasma infection using
expression-library immunization. Nature 1995;377:632-5).

The subject is administered the nucleic acid vaccine using standard methods.
The nucleic acid vaccine can be administered to the vertebrate parenterally,
subcutaneously, intravenously, intraperitoneally, intradermally, intramuscularly,
topically, orally, rectally, nasally, buccally, vaginally, by inhalation spray, or via an
implanted reservoir in dosage formulations containing conventional non-toxic,
physiologically acceptable carriers or vehicles. Alternatively, the subject is
administered the nucleic acid vaccine through the use of a particle acceleration or
bombardment instrument (a "gene gun"). The form in
which it is administered (e.g., capsule, tablet, solution, emulsion) will depend in part on
the route by which it is administered. For example, for mucosal administration, nose
drops, inhalants or suppositories can be used. The nucleic acid vaccine can be
administered in conjunction with known adjuvants. The adjuvant is administered in a
sufficient amount, which is that amount that is sufficient to generate an enhanced
immune response to the nucleic acid vaccine. The adjuvant can be administered prior to
(e.g., 1 or more days before) inoculation with the nucleic acid vaccine; concurrently
with (e.g., within 24 hours of) inoculation with the nucleic acid vaccine;
contemporaneously (simultaneously) with the nucleic acid vaccine (e.g., the adjuvant is
mixed with the nucleic acid vaccine, and the mixture is administered to the vertebrate);
or after (e.g., 1 or more days after) inoculation with the nucleic acid vaccine. The
adjuvant can also be administered at more than one time (e.g., prior to inoculation with
the nucleic acid vaccine and also after inoculation with the nucleic acid vaccine). As
used herein, the term "in conjunction with" encompasses any time period, including
those specifically described herein and combinations of the time periods specifically
described herein, during which the adjuvant can be administered so as to generate an
enhanced immune response to the nucleic acid vaccine (e.g., an increased antibody titer
to the antigen encoded by the nucleic acid vaccine, or an increased antibody titer to the
pathogenic agent). The adjuvant and the nucleic acid vaccine can be administered at
approximately the same location on the vertebrate; for example, both the adjuvant and
the nucleic acid vaccine are administered at a marked site on a limb of the subject.

In some embodiments, expression of a SARS virus gene or coding or
non-coding region of interest may be inhibited or prevented using RNA interference (RNAi)
technology, a type of post-transcriptional gene silencing. RNAi may be used to create a
functional "knockout", i.e. a system in which the expression of a gene or coding or
non-coding region of interest is reduced, resulting in an overall reduction of the encoded
product. As such, RNAi may be performed to target a nucleic acid of interest or
fragment or variant thereof, to in turn reduce its expression and the level of activity of
the product which it encodes. Such a system may be used for therapy or prophylaxis,
as well as for functional studies. RNAi is described in, for example, published US patent
applications 20020173478 (Gewirtz; published November 21, 2002) and 20020132788
(Lewis et al.; published November 7, 2002). Reagents and kits for performing RNAi
are available commercially from, for example, Ambion Inc. (Austin, TX, USA) and New
England Biolabs Inc. (Beverly, MA, USA).

The initial agent for RNAi in some systems is thought to be a dsRNA molecule
corresponding to a target nucleic acid. The dsRNA is then thought to be cleaved into
short interfering RNAs (siRNAs) which are 21-23 nucleotides in length (19-21 bp
duplexes, each with 2 nucleotide 3' overhangs). The enzyme thought to effect this first
cleavage step has been referred to as "Dicer" and is categorized as a member of the
RNase III family of dsRNA-specific ribonucleases. Alternatively, RNAi may be
effected via directly introducing into the cell, or generating within the cell by
introducing into the cell a suitable precursor (e.g. vector, etc.) of, such an siRNA or
siRNA-like molecule. An siRNA may then associate with other intracellular
components to form an RNA-induced silencing complex (RISC). The RISC thus
formed may subsequently target a transcript of interest via base-pairing interactions
between its siRNA component and the target transcript by virtue of homology, resulting
in the cleavage of the target transcript approximately 12 nucleotides from the 3' end of
the siRNA. Thus the target mRNA is cleaved and the level of protein product it
encodes is reduced.

RNAi may be effected by the introduction of suitable in vitro synthesized
siRNA or siRNA-like molecules into cells. RNAi may for example be performed using
chemically-synthesized RNA, for which suitable RNA molecules may be chemically
synthesized using known methods. Alternatively, suitable expression vectors may be
used to transcribe such RNA either in vitro or in vivo. In vitro transcription of sense
and antisense strands (encoded by sequences present on the same vector or on separate
vectors) may be effected using for example T7 RNA polymerase, in which case the
vector may comprise a suitable coding sequence operably-linked to a T7 promoter. The
in vitro-transcribed RNA may in embodiments be processed (e.g. using E. coli RNase
III) in vitro to a size conducive to RNAi. The sense and antisense transcripts are combined
to form an RNA duplex which is introduced into a target cell of interest. Other vectors
may be used, which express small hairpin RNAs (shRNAs) which can be processed
into siRNA-like molecules. Various vector-based methods are known in the art.
Various methods for introducing such vectors into cells, either in vitro or in vivo (e.g.
gene therapy) are known in the art.

Accordingly, in an embodiment, expression of a polypeptide including an amino
acid sequence substantially identical to a SARS virus sequence may be inhibited by
introducing into or generating within a cell an siRNA or siRNA-like molecule
corresponding to a nucleic acid molecule encoding the polypeptide or fragment thereof,
or to a nucleic acid homologous thereto. In various embodiments such a method may
entail the direct administration of the siRNA or siRNA-like molecule into a cell, or use
of the vector-based methods described above. In an embodiment, the siRNA or
siRNA-like molecule is less than about 30 nucleotides in length. In a further
embodiment, the siRNA or siRNA-like molecules are about 21-23 nucleotides in
length. In an embodiment, siRNA or siRNA-like molecules comprise a 19-21 bp
duplex portion, each strand having a 2 nucleotide 3' overhang. In embodiments, the
siRNA or siRNA-like molecule is substantially identical to a nucleic acid encoding the
polypeptide or a fragment or variant (or a fragment of a variant) thereof. Such a variant
is capable of encoding a protein having the activity of a SARS virus polypeptide. In
embodiments, the sense strand of the siRNA or siRNA-like molecule is substantially
identical to a SARS virus nucleic acid molecule or a fragment thereof (RNA having U
in place of T residues of the DNA sequence).
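
The duplex geometry described above (two 21-nucleotide strands forming a 19 bp duplex, each strand with a 2 nucleotide 3' overhang, and a sense strand matching the target with U in place of T) can be sketched as follows. This is a minimal illustration under stated assumptions, not a design method taken from this description: it takes a 23-nucleotide window of the target, the function names are hypothetical, and the example target sequence is invented and is not a SARS virus sequence.

```python
from typing import Tuple

# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(dna: str) -> str:
    """Write a DNA sense sequence as RNA (U in place of T)."""
    return dna.upper().replace("T", "U")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def design_sirna(target_dna: str, start: int) -> Tuple[str, str]:
    """Return (sense, antisense) 21-nt strands for the 23-nt window at `start` (0-based)."""
    window = to_rna(target_dna[start:start + 23])
    if len(window) != 23:
        raise ValueError("target must contain 23 nucleotides from the start position")
    sense = window[2:23]                           # 19-nt core plus 2-nt 3' overhang
    antisense = reverse_complement(window[0:21])   # pairs with the core, 2-nt 3' overhang
    return sense, antisense


if __name__ == "__main__":
    # Hypothetical target fragment, for illustration only.
    sense, antisense = design_sirna("ATGGCTAGCTAGGATCCGATCGATTACG", 0)
    print("sense     5'-" + sense + "-3'")
    print("antisense 5'-" + antisense + "-3'")
```

The first 19 nucleotides of each strand pair with one another, and the last two nucleotides of each strand remain unpaired as 3' overhangs.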

SARS Virus Protein Expression

In general, SARS virus polypeptides according to the invention may be
produced by transformation of a suitable host cell with all or part of a SARS virus
polypeptide-encoding genomic or cDNA molecule or fragment thereof (e.g., the
genomic DNA or cDNAs described herein) in a suitable expression vehicle. Those
skilled in the field of molecular biology will understand that any of a wide variety of
expression systems may be used to provide the recombinant protein. The precise host
cell used is not critical to the invention. The SARS virus polypeptide may be produced
in a prokaryotic host (e.g., E. coli or a virus, for example, a coronavirus such as human
OC43 or 229E, a bovine coronavirus, or a virus used for gene therapy, such as an
adenovirus) or in a eukaryotic host (e.g., Saccharomyces cerevisiae, insect cells, e.g.,
Sf21 cells, or mammalian cells, e.g., COS 1, NIH 3T3, VeroE6, or HeLa cells). Such
cells are available from a wide range of sources (e.g., the American Type Culture
Collection, Rockland, Md.; also, see, e.g., Ausubel et al., Current Protocols in
Molecular Biology, John Wiley & Sons, New York, 1994). The method of
transformation or transfection and the choice of expression vehicle will depend on the
host system selected. Transformation and transfection methods are described, e.g., in
Ausubel et al. (supra); expression vehicles may be chosen from those provided, e.g., in
Cloning Vectors: A Laboratory Manual (P. H. Pouwels et al., 1985, Supp. 1987), or
from commercially available sources. Suitable animal models, e.g. a ferret animal
model, or any other animal model suitable for analysis of SARS virus infection or
expression of SARS virus nucleic acid molecules may be used.

In an alternative embodiment, the baculovirus expression system (using, for
example, the vector pBacPAK9) available from Clontech (Palo Alto, Calif.) may be
used. If desired, this system may be used in conjunction with other protein expression
techniques, for example, the myc tag approach described by Evan et al. (Mol. Cell Biol.
5:3610-3616, 1985). In an alternative embodiment, a SARS virus polypeptide may be
produced by a stably-transfected mammalian cell line. A number of vectors suitable for
stable transfection of mammalian cells are available to the public, e.g., see Pouwels et
al. (supra); methods for constructing such cell lines are also publicly available, e.g., in
Ausubel et al. (supra). In one example, cDNA encoding the SARS virus polypeptide is
cloned into an expression vector which includes the dihydrofolate reductase (DHFR)
gene. Integration of the plasmid and, therefore, the SARS virus polypeptide-encoding
gene into the host cell chromosome is selected for by inclusion of 0.01-300 µM
methotrexate in the cell culture medium (as described in Ausubel et al., supra). This
dominant selection can be accomplished in most cell types. Recombinant protein
expression can be increased by DHFR-mediated amplification of the transfected gene.
Methods for selecting cell lines bearing gene amplifications are described in Ausubel et
al. (supra); such methods generally involve extended culture in medium containing
gradually increasing levels of methotrexate. DHFR-containing expression vectors
commonly used for this purpose include pCVSEII-DHFR and pAdD26SV(A)
(described in Ausubel et al., supra). Any of the host cells described above or,
preferably, a DHFR-deficient CHO cell line (e.g., CHO DHFR⁻ cells, ATCC
Accession No. CRL 9096) are among the host cells preferred for DHFR selection of a
stably-transfected cell line or DHFR-mediated gene amplification.

Once the recombinant SARS virus polypeptide is expressed, it is isolated, e.g.,
using affinity chromatography. In one example, an anti-SARS virus polypeptide
antibody (e.g., produced as described herein) may be attached to a column and used to
isolate the SARS virus polypeptide. Lysis and fractionation of SARS virus
polypeptide-harboring cells prior to affinity chromatography may be performed by standard
methods (see, e.g., Ausubel et al., supra). In another example, SARS virus polypeptides
may be purified or substantially purified from a mixture of compounds such as an
extract or supernatant obtained from cells (Ausubel et al., supra). Standard purification
techniques can be used to progressively eliminate undesirable compounds from the
mixture until a single compound or minimal number of effective compounds has been
isolated.

Once isolated, the recombinant protein can, if desired, be further purified, e.g.,
by high performance liquid chromatography (see, e.g., Fisher, Laboratory Techniques
In Biochemistry And Molecular Biology, eds., Work and Burdon, Elsevier, 1980).

Polypeptides of the invention, particularly short SARS virus peptide fragments,
can also be produced by chemical synthesis (e.g., by the methods described in Solid
Phase Peptide Synthesis, 2nd ed., 1984, The Pierce Chemical Co., Rockford, Ill.).

These general techniques of polypeptide expression and purification can also be
used to produce and isolate useful SARS virus protein fragments or analogs (described
herein).

In certain alternative embodiments, the SARS polypeptide might have attached
any one of a variety of tags. Tags can be amino acid tags or chemical tags and can be
added for the purpose of purification (for example a 6-histidine tag for purification over
a nickel column). In other preferred embodiments, various labels can be used as means
for detecting binding of a SARS polypeptide to another polypeptide, for example to a
cell surface receptor. Alternatively, SARS DNA or RNA may be labeled for detection,
for example in a hybridization assay. SARS virus nucleic acids or proteins, or
derivatives thereof, may be directly or indirectly labeled, for example, with a
radioisotope, a fluorescent compound, a bioluminescent compound, a chemiluminescent
compound, a metal chelator or an enzyme. Those of ordinary skill in the art will know
of other suitable labels or will be able to ascertain such, using routine experimentation.
In yet another embodiment of the invention, the polypeptides disclosed herein, or
derivatives thereof, are linked to toxins.


Isolation and Identification of Additional SARS Virus Molecules

Based on the SARS virus sequences described herein, the isolation and
identification of additional SARS virus-related sequences such as SARS virus genes
and of additional SARS virus strains or isolates is made possible using standard
techniques. In addition, the SARS virus sequences provided herein also provide the
basis for identification of homologous sequences from other species and genera from
both prokaryotes and eukaryotes such as viruses, bacteria, fungi, parasites, yeast, and/or
mammals. In some embodiments, the nucleic acid sequences described herein may be
used to design probes or primers, including degenerate oligonucleotide probes or
primers, based upon the sequence of either DNA strand. The probes or primers may
then be used to screen genomic or cDNA libraries for sequences from, for example,
naturally occurring variants or isolates of SARS viruses, using standard amplification
or hybridization techniques.
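
As a hedged illustration of one consideration in designing degenerate oligonucleotide
probes or primers, the Python sketch below counts how many distinct coding sequences
could encode a short peptide under the standard genetic code, and picks the peptide
window with the fewest. The function names and the example peptides are assumptions
made for this sketch, and the count is a simple product of codon numbers per residue,
not a full primer-design procedure.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODON_COUNT = {
    "M": 1, "W": 1,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "I": 3,
    "V": 4, "P": 4, "T": 4, "A": 4, "G": 4,
    "L": 6, "S": 6, "R": 6,
}


def degeneracy(peptide: str) -> int:
    """Number of distinct nucleotide sequences that could encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide)


def least_degenerate_window(protein: str, width: int) -> tuple[int, str, int]:
    """Return (start, peptide, degeneracy) for the least degenerate window.

    Ties are resolved in favour of the earliest window.
    """
    windows = [
        (degeneracy(protein[i:i + width]), i, protein[i:i + width])
        for i in range(len(protein) - width + 1)
    ]
    if not windows:
        raise ValueError("protein is shorter than the requested window")
    deg, start, peptide = min(windows)
    return start, peptide, deg
```

For the arbitrary example LSRMWHQKLS and a window of 5 residues, the window MWHQK
(degeneracy 8) would be preferred over LSRMW (degeneracy 216), since Met and Trp each
have a single codon whereas Leu, Ser and Arg each have six.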

In some embodiments, binding partners may be identified by tagging the
polypeptides of the invention (e.g., those substantially identical to SARS virus
polypeptides described herein) with an epitope sequence (e.g., FLAG or 2HA), and
delivering them into host cells, for example by transfection with a suitable vector
containing a nucleic acid sequence encoding a polypeptide of the invention, followed by
immunoprecipitation and identification of the binding partner. Cells may be infected
with strains expressing the FLAG or 2HA fusions, followed by lysis and
immunoprecipitation with anti-FLAG or anti-2HA antibodies. Binding partners may be
identified by mass spectroscopy. If the polypeptide of the invention is not produced in
sufficient quantities, such a method may not deliver enough tagged protein to identify
its partner. As part of a complementary approach, each polypeptide of the invention
may be cloned into a mammalian transfection vector fused to, for example, 2HA, GFP
and/or FLAG. Following transfection, HeLa cells may be lysed and the tagged
polypeptide immunoprecipitated. The binding partner may be identified by SDS PAGE
followed by mass spectroscopy.

In some embodiments, polypeptides or antibodies of the invention may be
tagged, produced, and used for example on affinity columns and/or in immunological
assays to identify and/or confirm identified target compounds. FLAG, HA, and/or His
tagged proteins can be used for such affinity columns to pull out host cell factors from
cell extracts, and any hits may be validated by standard binding assays, saturation
curves, and other methods as described herein or known to those of skill in the art.

In some embodiments, a two hybrid system may be used to study
protein-protein interactions. The nucleic acid sequences described herein, or sequences
substantially identical thereto, can be cloned into the pBT bait plasmid of the two
hybrid system, and a commercially available murine spleen library of 5 x 10^
independent clones may be used as the target library for the baits. Potential hits may
be further characterized by recovering the plasmids and retransforming to reduce false
positives resulting from clonal bait variants and library target clones which activate the
reporter genes independent of the cloned bait. Reproducible hits may be studied further
as described herein.

Virulence may be assayed as described herein or as known to those of skill in the art.
Once coding sequences have been identified, they may be isolated using standard
cloning techniques, and inserted into any suitable vector or replicon for, for example,
production of polypeptides. Such vectors and replicons include, without limitation,
bacteriophage λ (E. coli), pBR322 (E. coli), pACYC177 (E. coli), pKT230
(gram-negative bacteria), pGV1106 (gram-negative bacteria), pLAFR1 (gram-negative
bacteria), pME290 (non-E. coli gram-negative bacteria), pHV14 (E. coli and Bacillus
subtilis), pBD9 (Bacillus), pIJ61 (Streptomyces), pUC6 (Streptomyces), YIp5
(Saccharomyces), YCp19 (Saccharomyces) or bovine papilloma virus (mammalian
cells). In general, the polypeptides of the invention may be produced in any suitable
host cell transformed or transfected with a suitable vector. The method of
transformation or transfection and the choice of expression vehicle will depend on the
host system selected. A wide variety of expression systems may be used, and the
precise host cell used is not critical to the invention. For example, a polypeptide
according to the invention may be produced in a prokaryotic host (e.g., E. coli) or in a
eukaryotic host (e.g., Saccharomyces cerevisiae, insect cells, e.g., Sf21 cells, or
mammalian cells, e.g., NIH 3T3, HeLa, or COS cells). Such cells are available from a
wide range of sources (e.g., the American Type Culture Collection, Manassas, Va.).
Bacterial expression systems for polypeptide production include the E. coli pET
expression system (Novagen, Inc., Madison, Wis.), and the pGEX expression
system (Pharmacia).

Compounds 

In one aspect, compounds according to the invention include SARS virus
nucleic acid molecules and polypeptides, such as the sequences disclosed in the Figures
and Tables herein, and throughout the specification, and fragments thereof. In
alternative embodiments, compounds according to the invention may be nucleic acid
molecules that are at least 10 nucleotides in length, and that are derived from the
sequences described herein. In alternative embodiments, compounds according to the
invention may be peptides that are at least 5 amino acids in length, and that are derived
from the sequences described herein.

In alternative embodiments, a compound according to the invention can be a
non-peptide molecule as well as a peptide or peptide analogue. A peptide or peptide
analogue will generally be as small as feasible while retaining full biological activity.
A non-peptide molecule can be any molecule that exhibits biological activity as
described herein or known in the art. Biological activity can, for example, be measured
in terms of ability to elicit a cytotoxic response, to mediate DNA replication, or any
other function of a SARS virus molecule.

Compounds can be prepared by, for example, replacing, deleting, or inserting an
amino acid residue of a SARS peptide or peptide analogue, as described herein, with
other conservative amino acid residues, i.e., residues having similar physical,
biological, or chemical properties, and screening for biological function.

It is well known in the art that some modifications and changes can be made in
the structure of a polypeptide without substantially altering the biological function of
that peptide, to obtain a biologically equivalent polypeptide. Such modifications may
be made for the purpose of modifying function, or for facilitating administration or
enhancing stability or inhibiting breakdown for, for example, therapeutic uses. For
example, an indigestible SARS virus compound according to the invention may be used
for oral administration; a modification that is suitable for inhalation may be used for
administration to the lung; or addition of a leader sequence may increase protein
expression levels.

In one aspect of the invention, SARS virus-derived peptides or epitopes may
include peptides that differ from a portion of a native leader, protein or SARS virus
sequence by conservative amino acid substitutions. The peptides and epitopes of the
present invention also extend to biologically equivalent peptides that differ from a
portion of the sequence of novel peptides of the present invention by conservative
amino acid substitutions. As used herein, the term "conserved amino acid
substitutions" refers to the substitution of one amino acid for another at a given location
in the peptide, where the substitution can be made without substantial loss of the
relevant function. In making such changes, substitutions of like amino acid residues can
be made on the basis of relative similarity of side-chain substituents, for example, their
size, charge, hydrophobicity, hydrophilicity, and the like, and such substitutions may be
assayed for their effect on the function of the peptide by routine testing.

In some embodiments, conserved amino acid substitutions may be made where
an amino acid residue is substituted for another having a similar hydrophilicity value
(e.g., within a value of plus or minus 2.0), where the following hydrophilicity values
are assigned to amino acid residues (as detailed in United States Patent No. 4,554,101,
incorporated herein by reference): Arg (+3.0); Lys (+3.0); Asp (+3.0); Glu (+3.0);
Ser (+0.3); Asn (+0.2); Gln (+0.2); Gly (0); Pro (-0.5); Thr (-0.4); Ala (-0.5);
His (-0.5); Cys (-1.0); Met (-1.3); Val (-1.5); Leu (-1.8); Ile (-1.8); Tyr (-2.3);
Phe (-2.5); and Trp (-3.4).
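
The "plus or minus 2.0" criterion above can be expressed as a simple pairwise test. The
following Python sketch is offered only as an illustration: the table reproduces the
hydrophilicity values listed above, while the function name and the default tolerance
argument are assumptions made for this sketch.

```python
# Hydrophilicity values as listed in the text above (three-letter residue codes).
HYDROPHILICITY = {
    "Arg": 3.0, "Lys": 3.0, "Asp": 3.0, "Glu": 3.0, "Ser": 0.3,
    "Asn": 0.2, "Gln": 0.2, "Gly": 0.0, "Pro": -0.5, "Thr": -0.4,
    "Ala": -0.5, "His": -0.5, "Cys": -1.0, "Met": -1.3, "Val": -1.5,
    "Leu": -1.8, "Ile": -1.8, "Tyr": -2.3, "Phe": -2.5, "Trp": -3.4,
}


def similar_hydrophilicity(a: str, b: str, tolerance: float = 2.0) -> bool:
    """True if the two residues differ by no more than `tolerance` in value."""
    return abs(HYDROPHILICITY[a] - HYDROPHILICITY[b]) <= tolerance
```

Under this test a Leu to Ile exchange (difference 0.0) or a Ser to Thr exchange
(difference 0.7) would qualify, whereas an Asp to Trp exchange (difference 6.4) would
not.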

WO 2004/096842 PCT/CA2004/000626

In alternative embodiments, conserved amino acid substitutions may be made
where an amino acid residue is substituted for another having a similar hydropathic
index (e.g., within a value of plus or minus 2.0). In such embodiments, each amino acid
residue may be assigned a hydropathic index on the basis of its hydrophobicity and
charge characteristics, as follows: Ile (+4.5); Val (+4.2); Leu (+3.8); Phe (+2.8); Cys
(+2.5); Met (+1.9); Ala (+1.8); Gly (-0.4); Thr (-0.7); Ser (-0.8); Trp (-0.9); Tyr (-1.3);
Pro (-1.6); His (-3.2); Glu (-3.5); Gln (-3.5); Asp (-3.5); Asn (-3.5); Lys (-3.9); and Arg
(-4.5).
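
To show how the hydropathic index might be used to enumerate candidate substitutions
for a given residue, a minimal Python sketch follows. The table reproduces the index
values listed above; the function name, the closest-first ordering of the result, and
the default tolerance argument are assumptions made for this sketch.

```python
# Hydropathic index values as listed in the text above (three-letter codes).
HYDROPATHIC_INDEX = {
    "Ile": 4.5, "Val": 4.2, "Leu": 3.8, "Phe": 2.8, "Cys": 2.5,
    "Met": 1.9, "Ala": 1.8, "Gly": -0.4, "Thr": -0.7, "Ser": -0.8,
    "Trp": -0.9, "Tyr": -1.3, "Pro": -1.6, "His": -3.2, "Glu": -3.5,
    "Gln": -3.5, "Asp": -3.5, "Asn": -3.5, "Lys": -3.9, "Arg": -4.5,
}


def conservative_candidates(residue: str, tolerance: float = 2.0) -> list[str]:
    """List other residues whose index lies within `tolerance` of `residue`.

    Candidates are returned closest first.
    """
    ref = HYDROPATHIC_INDEX[residue]
    hits = [
        other for other, value in HYDROPATHIC_INDEX.items()
        if other != residue and abs(value - ref) <= tolerance
    ]
    return sorted(hits, key=lambda other: abs(HYDROPATHIC_INDEX[other] - ref))
```

For Trp (-0.9), for instance, the candidates under this criterion are Ser, Thr, Tyr,
Gly and Pro; for Arg (-4.5) they are Lys, His, Glu, Gln, Asp and Asn. Any such
candidate would still be assayed for its effect on function by routine testing.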

In alternative embodiments, conserved amino acid substitutions may be made
where an amino acid residue is substituted for another in the same class, where the
amino acids are divided into non-polar, acidic, basic and neutral classes, as follows:
non-polar: Ala, Val, Leu, Ile, Phe, Trp, Pro, Met; acidic: Asp, Glu; basic: Lys, Arg,
His; neutral: Gly, Ser, Thr, Cys, Asn, Gln, Tyr.
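
The class-based criterion lends itself to a lookup table. In the Python sketch below,
which is illustrative only, the four classes are those listed above; the function names
and the representation of a peptide as a list of three-letter codes are assumptions
made for this sketch.

```python
# The four classes listed in the text above.
CLASSES = {
    "non-polar": {"Ala", "Val", "Leu", "Ile", "Phe", "Trp", "Pro", "Met"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg", "His"},
    "neutral": {"Gly", "Ser", "Thr", "Cys", "Asn", "Gln", "Tyr"},
}

# Reverse lookup: residue -> class name.
CLASS_OF = {res: name for name, members in CLASSES.items() for res in members}


def same_class(a: str, b: str) -> bool:
    """True if both residues belong to the same class."""
    return CLASS_OF[a] == CLASS_OF[b]


def is_conservative_variant(reference: list[str], variant: list[str]) -> bool:
    """True if every position of `variant` stays within the reference's class.

    Peptides of unequal length are rejected, since insertions and deletions
    fall outside this substitution-only check.
    """
    if len(reference) != len(variant):
        return False
    return all(same_class(r, v) for r, v in zip(reference, variant))
```

Under this criterion the peptide Ala-Asp-Lys and the variant Val-Glu-Arg differ only by
same-class substitutions, whereas replacing Asp with Lys would cross from the acidic to
the basic class.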

Conservative amino acid changes can include the substitution of an L-amino
acid by the corresponding D-amino acid, by a conservative D-amino acid, or by a
naturally-occurring, non-genetically encoded form of amino acid, as well as a
conservative substitution of an L-amino acid. Naturally-occurring non-genetically
encoded amino acids include beta-alanine, 3-amino-propionic acid,
2,3-diaminopropionic acid, alpha-aminoisobutyric acid, 4-amino-butyric acid,
N-methylglycine (sarcosine), hydroxyproline, ornithine, citrulline, t-butylalanine,
t-butylglycine, N-methylisoleucine, phenylglycine, cyclohexylalanine, norleucine,
norvaline, 2-naphthylalanine, pyridylalanine, 3-benzothienyl alanine,
4-chlorophenylalanine, 2-fluorophenylalanine, 3-fluorophenylalanine,
4-fluorophenylalanine, penicillamine,
1,2,3,4-tetrahydro-isoquinoline-3-carboxylic acid, beta-2-thienylalanine, methionine
sulfoxide, homoarginine, N-acetyl lysine, 2-amino butyric acid,
2,4-diaminobutyric acid, p-aminophenylalanine, N-methylvaline, homocysteine,
homoserine, cysteic acid, epsilon-amino hexanoic acid, delta-amino valeric acid, or
2,3-diaminobutyric acid.

In alternative embodiments, conservative amino acid changes include changes
based on considerations of hydrophilicity or hydrophobicity, size or volume, or charge.
Amino acids can be generally characterized as hydrophobic or hydrophilic, depending
primarily on the properties of the amino acid side chain. A hydrophobic amino acid
exhibits a hydrophobicity of greater than zero, and a hydrophilic amino acid exhibits a
hydrophilicity of less than zero, based on the normalized consensus hydrophobicity
scale of Eisenberg et al. (J. Mol. Biol. 179:125-142, 1984). Genetically encoded
hydrophobic amino acids include Gly, Ala, Phe, Val, Leu, Ile, Pro, Met and Trp, and
genetically encoded hydrophilic amino acids include Thr, His, Glu, Gln, Asp, Arg, Ser,
and Lys. Non-genetically encoded hydrophobic amino acids include t-butylalanine,
while non-genetically encoded hydrophilic amino acids include citrulline and
homocysteine.

Hydrophobic or hydrophilic amino acids can be further subdivided based on the
characteristics of their side chains. For example, an aromatic amino acid is a
hydrophobic amino acid with a side chain containing at least one aromatic or
heteroaromatic ring, which may contain one or more substituents such as -OH, -SH,
-CN, -F, -Cl, -Br, -I, -NO2, -NO, -NH2, -NHR, -NRR, -C(O)R, -C(O)OH, -C(O)OR,
-C(O)NH2, -C(O)NHR, -C(O)NRR, etc., where R is independently (C1-C6) alkyl,
substituted (C1-C6) alkyl, (C1-C6) alkenyl, substituted (C1-C6) alkenyl, (C1-C6)
alkynyl, substituted (C1-C6) alkynyl, (C5-C20) aryl, substituted (C5-C20) aryl, (C6-C26)
alkaryl, substituted (C6-C26) alkaryl, 5-20 membered heteroaryl, substituted 5-20
membered heteroaryl, 6-26 membered alkheteroaryl or substituted 6-26 membered
alkheteroaryl. Genetically encoded aromatic amino acids include Phe, Tyr, and Trp,
while non-genetically encoded aromatic amino acids include phenylglycine,
2-naphthylalanine, beta-2-thienylalanine, 1,2,3,4-tetrahydro-isoquinoline-3-carboxylic
acid, 4-chlorophenylalanine, 2-fluorophenylalanine, 3-fluorophenylalanine, and
4-fluorophenylalanine.

An apolar amino acid is a hydrophobic amino acid with a side chain that is
uncharged at physiological pH and which has bonds in which a pair of electrons shared
in common by two atoms is generally held equally by each of the two atoms (i.e., the
side chain is not polar). Genetically encoded apolar amino acids include Gly, Leu, Val,
Ile, Ala, and Met, while non-genetically encoded apolar amino acids include
cyclohexylalanine. Apolar amino acids can be further subdivided to include aliphatic
amino acids, which are hydrophobic amino acids having an aliphatic hydrocarbon side
chain. Genetically encoded aliphatic amino acids include Ala, Leu, Val, and Ile, while
non-genetically encoded aliphatic amino acids include norleucine.

A polar amino acid is a hydrophilic amino acid with a side chain that is
uncharged at physiological pH, but which has one bond in which the pair of electrons
shared in common by two atoms is held more closely by one of the atoms. Genetically
encoded polar amino acids include Ser, Thr, Asn, and Gln, while non-genetically
encoded polar amino acids include citrulline, N-acetyl lysine, and methionine
sulfoxide.

An acidic amino acid is a hydrophilic amino acid with a side chain pKa value of
less than 7. Acidic amino acids typically have negatively charged side chains at
physiological pH due to loss of a hydrogen ion. Genetically encoded acidic amino
acids include Asp and Glu. A basic amino acid is a hydrophilic amino acid with a side
chain pKa value of greater than 7. Basic amino acids typically have positively charged
side chains at physiological pH due to association with hydronium ion. Genetically
encoded basic amino acids include Arg, Lys, and His, while non-genetically encoded
basic amino acids include the non-cyclic amino acids ornithine, 2,3-diaminopropionic
acid, 2,4-diaminobutyric acid, and homoarginine.

It will be appreciated by one skilled in the art that the above classifications are
not absolute and that an amino acid may be classified in more than one category. In
addition, amino acids can be classified based on known behaviour and/or characteristic
chemical, physical, or biological properties based on specified assays or as compared
with previously identified amino acids. Amino acids can also include bifunctional
moieties having amino acid-like side chains.

Conservative changes can also include the substitution of a chemically
derivatised moiety for a non-derivatised residue, by for example, reaction of a
functional side group of an amino acid. Thus, these substitutions can include
compounds whose free amino groups have been derivatised to amine hydrochlorides,
p-toluene sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl groups, chloroacetyl
groups or formyl groups. Similarly, free carboxyl groups can be derivatized to form
salts, methyl and ethyl esters or other types of esters or hydrazides, and side chains can
be derivatized to form O-acyl or O-alkyl derivatives for free hydroxyl groups or
N-im-benzylhistidine for the imidazole nitrogen of histidine. Peptide analogues also include
amino acids that have been chemically altered, for example, by methylation, by
amidation of the C-terminal amino acid by an alkylamine such as ethylamine,
ethanolamine, or ethylene diamine, or acylation or methylation of an amino acid side
chain (such as acylation of the epsilon amino group of lysine). Peptide analogues can
also include replacement of the amide linkage in the peptide with a substituted amide
(for example, groups of the formula -C(O)-NR, where R is (C1-C6) alkyl, (C1-C6)
alkenyl, (C1-C6) alkynyl, substituted (C1-C6) alkyl, substituted (C1-C6) alkenyl, or
substituted (C1-C6) alkynyl) or isostere of an amide linkage (for example, -CH2NH-,
-CH2S, -CH2CH2-, -CH=CH- (cis and trans), -C(O)CH2-, -CH(OH)CH2-, or -CH2SO-).

The compound can be covalently linked, for example, by polymerisation or
conjugation, to form homopolymers or heteropolymers. Spacers and linkers, typically
composed of small neutral molecules, such as amino acids that are uncharged under
physiological conditions, can be used. Linkages can be achieved in a number of ways.
For example, cysteine residues can be added at the peptide termini, and multiple
peptides can be covalently bonded by controlled oxidation. Alternatively,
heterobifunctional agents, such as disulfide/amide forming agents or thioether/amide
forming agents can be used. The compound can also be constrained, for example, by
having cyclic portions.

In some embodiments, three dimensional molecular modeling techniques may
be used to identify or generate compounds that may be useful as therapeutics or
diagnostics. Standard molecular modeling tools may be used, for example, those
described in L.-H. Hung and R. Samudrala, PROTINFO: secondary and tertiary protein
structure prediction, Nucleic Acids Research, 2003, Vol. 31, No. 13, 3296-3299; A.
Yamaguchi, et al., Enlarged FAMSBASE: protein 3D structure models of genome
sequences for 41 species, Nucleic Acids Research, 2003, Vol. 31, No. 1, 463-468; J.
Chen, et al., MMDB: Entrez's 3D-structure database, Nucleic Acids Research, 2003,
Vol. 31, No. 1, 474-477; R. A. Chiang, et al., The Structure Superposition Database,
Nucleic Acids Research, 2003, Vol. 31, No. 1, 505-510.

Peptides or peptide analogues can be synthesized by standard chemical
techniques, for example, by automated synthesis using solution or solid phase synthesis
methodology. Automated peptide synthesizers are commercially available and use
techniques well known in the art. Peptides and peptide analogues can also be prepared
using recombinant DNA technology using standard methods such as those described in,
for example, Sambrook, et al. (Molecular Cloning: A Laboratory Manual, 2nd ed.,
Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring
Harbor, N.Y., 1989) or Ausubel et al. (Current Protocols in Molecular Biology, John
Wiley & Sons, 1994).

Compounds, such as peptides (or analogues thereof) can be identified by routine
experimentation by, for example, modifying residues within SARS peptides;
introducing single or multiple amino acid substitutions, deletions, or insertions, and
identifying those compounds that retain biological activity, e.g., those compounds that
have cytotoxic ability.

In general, candidate compounds for prevention or treatment of SARS
virus-mediated disorders are identified from large libraries of both natural product or
synthetic (or semi-synthetic) extracts or chemical libraries according to methods known
in the art. Candidate or test compounds may include, without limitation, peptides,
polypeptides, synthesised organic molecules, naturally occurring organic molecules,
and nucleic acid molecules. In some embodiments, such compounds are screened for the
ability to inhibit SARS virus replication or pathogenicity, while maintaining the
infected cell's ability to grow or survive.

Those skilled in the field of drug discovery and development will understand
that the precise source of test extracts or compounds is not critical to the method(s) of
the invention. Accordingly, virtually any number of chemical extracts or compounds
can be screened using the exemplary methods described herein or using standard
methods. Examples of such extracts or compounds include, but are not limited to,
plant-, fungal-, prokaryotic- or animal-based extracts, fermentation broths, and synthetic
compounds, as well as modification of existing compounds. Numerous methods are
also available for generating random or directed synthesis (e.g., semi-synthesis or total
synthesis) of any number of chemical compounds, including, but not limited to,
saccharide-, lipid-, peptide-, and nucleic acid-based compounds. Synthetic compound
libraries are commercially available. Alternatively, libraries of natural compounds in
the form of bacterial, fungal, plant, and animal extracts are commercially available
from a number of sources, including Biotics (Sussex, UK), Xenova (Slough, UK),
Harbor Branch Oceanographic Institute (Ft. Pierce, Fla.), and PharmaMar, U.S.A.
(Cambridge, Mass.). In addition, natural and synthetically produced libraries of, for
example, SARS virus polypeptides containing leader sequences, are produced, if
desired, according to methods known in the art, e.g., by standard extraction and
fractionation methods. Furthermore, if desired, any library or compound is readily
modified using standard chemical, physical, or biochemical methods.

When a crude extract is found to modulate cytotoxicity or viral infection,
further fractionation of the positive lead extract is necessary to isolate chemical
constituents responsible for the observed effect. Thus, the goal of the extraction,
fractionation, and purification process is the careful characterization and identification
of a chemical entity within the crude extract having, for example, anti-cytotoxicity or
anti-viral properties. The same assays described herein for the detection of activities in
mixtures of compounds can be used to purify the active component and to test
derivatives thereof. Methods of fractionation and purification of such heterogeneous
extracts are known in the art. If desired, compounds shown to be useful agents for
treatment are chemically modified according to methods known in the art. Compounds
identified as being of therapeutic, prophylactic, diagnostic, or other value in, for
example, cell culture systems, such as a Vero E6 culture system, may be subsequently
analyzed using a ferret animal model, or any other animal model suitable for analysis of
SARS.


Antibodies 

The compounds of the invention can be used to prepare antibodies to SARS
virus peptides, protein, polyproteins, or analogs thereof, or to SARS virus nucleic acid
molecules or analogs thereof using standard techniques of preparation as, for example,
described in Harlow and Lane (Antibodies: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y., 1988), or known to those skilled in the art.
Antibodies may include polyclonal antibodies, monoclonal antibodies, hybrid
antibodies (e.g., divalent antibodies having different pairs of heavy and light chains),
chimeric antibodies (e.g., antibodies having constant and variable domains from
different species and/or class), modified antibodies (e.g., antibodies in which the
naturally occurring sequence has been altered by for example recombinant techniques),
Fab antibodies, anti-idiotype antibodies, etc. Antibodies can be tailored to minimise
adverse host immune response by, for example, using chimeric antibodies containing
an antigen binding domain from one species and the Fc portion from another species, or
by using antibodies made from hybridomas of the appropriate species. For example,
"humanized" antibodies may be used for administration to humans.

To generate SARS virus polypeptide-specific antibodies, a SARS virus
polypeptide coding sequence may be expressed, for example, as a C-terminal fusion
with glutathione S-transferase (GST) (Smith et al., Gene 67:31-40, 1988). The fusion
polypeptide may then be purified on glutathione-Sepharose beads, eluted with
glutathione, cleaved with thrombin (at the engineered cleavage site), and purified to the
degree necessary for immunization of rabbits. Primary immunizations are carried out
with Freund's complete adjuvant and subsequent immunizations with Freund's incomplete
adjuvant. Antibody titres are monitored by Western blot and immunoprecipitation
analyses using the thrombin-cleaved SARS virus polypeptide fragment of the GST-SARS
virus fusion polypeptide. Immune sera are affinity purified using
CNBr-Sepharose-coupled SARS virus polypeptide. Antiserum specificity is determined using
a panel of unrelated GST polypeptides.

As an alternate or adjunct immunogen to GST fusion polypeptides, peptides
corresponding to relatively unique hydrophilic SARS virus polypeptides may be
generated and coupled to keyhole limpet hemocyanin (KLH) through an introduced
C-terminal lysine. Antiserum to each of these peptides is similarly affinity purified on
peptides conjugated to BSA, and specificity tested in ELISA and Western blots using
peptide conjugates, and by Western blot and immunoprecipitation using SARS virus
polypeptide expressed as a GST fusion polypeptide.

Alternatively, monoclonal antibodies may be prepared using the SARS virus
polypeptides described above and standard hybridoma technology (see, e.g., Kohler et
al., Nature, 256:495, 1975; Kohler et al., Eur. J. Immunol. 6:511, 1976; Kohler et al.,
Eur. J. Immunol. 6:292, 1976; Hammerling et al., In Monoclonal Antibodies and T Cell
Hybridomas, Elsevier, NY, 1981; Ausubel et al., supra). Once produced, monoclonal
antibodies are also tested for specific SARS virus polypeptide recognition by Western
blot or immunoprecipitation analysis (by the methods described in Ausubel et al.,
supra). Antibodies which specifically recognize SARS virus polypeptides are
considered to be useful in the invention; such antibodies may be used, e.g., in an
immunoassay to monitor the level of SARS virus polypeptides produced by a mammal
(for example, to determine the amount or location of a SARS virus polypeptide).

In an alternative embodiment, antibodies of the invention are produced not only
using the whole SARS virus polypeptide; fragments of the SARS virus
polypeptide which are unique or which lie outside highly conserved regions and appear
likely to be antigenic, by criteria such as high frequency of charged residues, may also
be used. In one specific example, such fragments are generated by standard techniques
of PCR and cloned into the pGEX expression vector (Ausubel et al., supra). Fusion
polypeptides are expressed in E. coli and purified using a glutathione agarose affinity
matrix as described in Ausubel et al. (supra). To attempt to minimize the potential
problems of low affinity or specificity of antisera, two or three such fusions are
generated for each polypeptide, and each fusion is injected into at least two rabbits.
Antisera are raised by injections in a series, preferably including at least three booster
injections. SARS virus antibodies may also be prepared against SARS virus nucleic
acid molecules.
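
By way of illustration only, the criterion of "high frequency of charged residues"
mentioned above could be applied with a simple sliding-window scan. The following
Python sketch is not part of the described method; the window length, the set of
residues treated as charged (D, E, K, R and H), the function names, and the toy
sequence are all assumptions chosen for the example.

```python
# Hypothetical helper: rank fixed-length windows of a protein by the
# fraction of charged residues. All names and defaults are assumptions.
CHARGED = set("DEKRH")  # assumption: Asp, Glu, Lys, Arg, His count as charged


def charged_fraction(peptide):
    """Fraction of residues in `peptide` that belong to CHARGED."""
    peptide = peptide.upper()
    return sum(res in CHARGED for res in peptide) / len(peptide)


def rank_windows(protein, window=15, top=3):
    """Return up to `top` tuples (start, fragment, fraction), best first.

    Ties are broken by the earlier start position. `start` is 0-based.
    """
    protein = protein.upper()
    scored = [
        (i, protein[i:i + window], charged_fraction(protein[i:i + window]))
        for i in range(len(protein) - window + 1)
    ]
    return sorted(scored, key=lambda t: (-t[2], t[0]))[:top]


if __name__ == "__main__":
    toy = "MLLVAAGGKDEKRRDEEKKLLAAVVGGSSTT"  # made-up sequence, not a SARS protein
    for start, fragment, fraction in rank_windows(toy, window=10):
        print(start, fragment, round(fraction, 2))
```

Candidate windows found in this way would still need to be checked against the other
criteria stated above, namely uniqueness and location outside highly conserved regions.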

Antibodies may be used as diagnostics, therapeutics, or prophylactics for SARS
virus-related disorders. Antibodies may also be used to isolate SARS virus and
compounds by, for example, affinity chromatography, or to identify SARS virus
compounds isolated or generated by other techniques.

Arrays and Libraries 

In some aspects, biological assays, such as diagnostic or other assays, using
high density nucleic acid, polypeptide, or antibody arrays, for example high density
miniaturized arrays or "microarrays," of SARS virus nucleic acid molecules or
polypeptides, or antibodies capable of specifically binding such nucleic acid molecules
or polypeptides, may be performed. Macroarrays, performed for example by manual
spotting techniques, may also be used. Arrays generally require a solid support (for
example, nylon, glass, ceramic, plastic, silicon, nitrocellulose or PVDF membranes,
microwells, microbeads, e.g., magnetic microbeads, etc.) to
which the nucleic acid molecules or polypeptides or antibodies are attached in a
specified two-dimensional arrangement, such that the pattern of hybridization is easily
determinable. Suspension arrays (particles in suspension) that are coded to facilitate
identification may also be used. SARS virus nucleic acid molecules or polypeptide
probes or targets may be compounds as described herein.

In some embodiments, high density nucleic acid arrays may for example be
used to monitor the presence or level of expression of a large number of SARS virus
nucleic acid molecules or genes or for detecting or identifying SARS virus nucleic acid
sequence variations, mutations or polymorphisms. For the purpose of such arrays,
"nucleic acids" may include any polymer or oligomer of nucleosides or nucleotides
(polynucleotides or oligonucleotides), which include pyrimidine and purine bases,
preferably cytosine, thymine, and uracil, and adenine and guanine, respectively, or may
include peptide nucleic acids (PNA). In an alternative aspect, the invention provides
nucleic acid microarrays including a number of distinct nucleic acid sequence arrays of
the invention, thus providing specific "sets" of sequences. The number of distinct
sequences may for example be any integer between 2 and 1 x 10^, such as at least 10^,
10^, 10^, or 10^.

The invention also provides gene knockout and expression libraries. Thus,
nucleic acid molecules encoding SARS virus polypeptides or proteins (e.g., PCR
products of ORFs or total mRNA) may for example be attached to a solid support,
hybridized with single stranded detectably-labeled cDNAs (corresponding to an
"antisense" orientation), and quantified using an appropriate method such that a signal
is detected at each location at which hybridization has taken place. The intensity of the
signal would then reflect the level of gene expression. Comparison of results from
viruses, for example, of different strains or from different samples or subjects, would
elucidate differing levels of expression of specified genes. Using similar techniques,
homologous nucleic acids may be identified from different viruses if SARS virus
nucleic acids are used in the microarray, and probed with nucleic acid molecules from
different viruses or subjects. In some embodiments, this approach may involve
constructing his-tagged ORF expression libraries of viral genomes in a bacterial host,
similar to an expression library in yeast (Martzen M. R. et al., 1999. Science,
286:1153). ORF-encoded protein activities may for example be detected in purified
his-tagged protein pools in cases where activities cannot be detected in extracts or cells. In
one aspect of the invention, arrayed libraries may be constructed of viral strains each of
which bears a plasmid expressing a different SARS virus ORF under control of an
inducible promoter. ORFs are amplified using PCR and cloned into a vector that
enables their expression as N-terminal his-tagged polypeptides. These amplicons are
also used to construct hybridization microarrays and enable targeted gene disruption,
reducing expenses. A suitable expression host is selected, and genes encoding
particular biochemical activities are identified by screening arrayed pools of his-tagged
proteins as described previously (Martzen M. R., McCraith S. M., Spinelli S. L., Torres
F. M., Fields S., Grayhack E. J., and Phizicky E. M., 1999. Science, 286:1153).
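
The comparison of signal intensities between samples described above is commonly
expressed as a log2 ratio per arrayed spot. The following Python sketch is offered only
as an illustration under stated assumptions: the dictionary layout, the ORF labels, the
intensity values, and the `floor` parameter used to avoid taking the logarithm of zero
are all hypothetical and are not taken from the specification.

```python
import math


def log2_ratios(sample_a, sample_b, floor=1.0):
    """Return log2(sample_a / sample_b) for every spot present in both samples.

    sample_a, sample_b: dicts mapping a spot label (e.g. an ORF name) to a
    background-corrected intensity. Intensities below `floor` are raised to
    `floor` so that the ratio is always defined (an assumption of this sketch).
    """
    ratios = {}
    for spot in sample_a.keys() & sample_b.keys():
        a = max(sample_a[spot], floor)
        b = max(sample_b[spot], floor)
        ratios[spot] = math.log2(a / b)
    return ratios


if __name__ == "__main__":
    # Made-up intensities for two hypothetical samples.
    strain_1 = {"orf_x": 800.0, "orf_y": 100.0}
    strain_2 = {"orf_x": 200.0, "orf_y": 100.0}
    for spot, value in sorted(log2_ratios(strain_1, strain_2).items()):
        print(spot, value)
```

A positive value indicates a stronger signal in the first sample, a negative value a
stronger signal in the second, and zero indicates equal signal.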

In some embodiments, protein arrays (including antibody or antigen arrays)
may be used for the analysis and identification of SARS virus polypeptides or host
responses to such polypeptides. Thus, protein arrays may be used to detect SARS virus
polypeptides in a patient; distinguish a SARS virus polypeptide from a host
polypeptide; detect interactions between SARS virus polypeptides and, for example, host
proteins; determine the efficacy of potential therapeutics, such as small molecules or
ligands that may bind SARS virus polypeptides; determine protein-antibody
interactions; and/or detect enzyme-substrate interactions. Protein
arrays may also be used to detect SARS virus antigens and antibodies in samples; to
profile expression of SARS virus polypeptides; to identify suitable antibodies or map
epitopes; or for a variety of protein function analyses.

A variety of methods are known for making and using microarrays, as for
example disclosed in Cheung V. G., et al., 1999. Nature Genetics Supplement, 21:15-19;
Lipshutz R. J., et al., 1999. Nature Genetics Supplement, 21:20-24; Bowtell D. D.
L., 1999. Nature Genetics Supplement, 21:25-32; Singh-Gasson S., et al., 1999.
Nature Biotechnol., 17:974-978; and Schweitzer B., et al., 2002. Nature Biotechnol.,
20:359-365. Thus, for example, microarrays may be designed by synthesizing
oligonucleotides with sequence variations based on a reference sequence, such as any
SARS virus sequences described herein. Methods for storing, querying and analyzing
microarray data have been disclosed in, for example, United States Patent
No. 6,484,183; United States Patent No. 6,188,783; and Holloway A. J., et al., 2002.
Nature Genetics Supplement, 32:481-489. Protein arrays may be constructed, detected,
and analysed using methods known in the art, for example mass spectrometric
techniques, immunoassays such as ELISA and western (dot) blotting combined with, for
example, fluorescence detection techniques, and adapted for high throughput analysis,
as described in, for example, MacBeath, G. and Schreiber, S.L. Science 2000, 289,
1760-1763; Levit-Binnun N, et al. (2003) Quantitative detection of protein arrays. Anal
Chem 75:1436-41; Kukar T, et al. (2002) Protein microarrays to detect protein-protein
interactions using red and green fluorescent proteins. Anal Biochem 306:50-4;
Borrebaeck CA, et al. (2001) Protein chips based on recombinant antibody fragments:
a highly sensitive approach as detected by mass spectrometry. Biotechniques
30:1126-1132; Huang RP (2001) Detection of multiple proteins in an antibody-based protein
microarray system. J Immunol Methods 255:1-13; Emili AQ and Cagney G (2000)
Large-scale functional analysis using peptide or protein arrays. Nature Biotechnol
18:393-397; Zhu H, et al. (2000) Analysis of yeast protein kinases using protein chips.
Nature Genet 26:283-9; Lucking A, et al. (1999) Protein Microarrays for Gene
Expression and Antibody Screening. Anal. Biochem. 270:103-111; or Templin MP, et
al. (2002) Protein microarray technology. Drug Discov Today 7:815-822. Tools for
microarray techniques are available commercially from, for example, Affymetrix, Santa
Clara, CA; Nanogen, San Diego, CA; or Sequenom, San Diego, CA.
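
As one possible illustration of designing oligonucleotides "with sequence variations
based on a reference sequence", the Python sketch below takes a window of a reference
and emits the matching probe together with the three probes that differ only at the
central base. The 25-nucleotide default length, the choice of the central position, the
function name, and the toy reference are assumptions made for this example and are not
prescribed by the specification or by the cited references.

```python
BASES = "ACGT"


def central_variants(reference, start, length=25):
    """Return [perfect-match probe, variant, variant, variant].

    The probe is reference[start:start + length] (0-based start). Each
    variant replaces only the central base with one of the other bases.
    """
    probe = reference[start:start + length].upper()
    if len(probe) != length:
        raise ValueError("window runs past the end of the reference")
    mid = length // 2
    variants = [probe]
    for base in BASES:
        if base != probe[mid]:
            variants.append(probe[:mid] + base + probe[mid + 1:])
    return variants


if __name__ == "__main__":
    toy_reference = "ACGTACGTACGTACGTACGTACGTACGTACG"  # made-up sequence
    for oligo in central_variants(toy_reference, start=3):
        print(oligo)
```

Sliding `start` along the reference one base at a time would give one such probe set
per interrogated position.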

Computer Readable Records 

Nucleic acid and polypeptide sequences, as described herein, or a fragment
thereof, may be provided in a variety of media to facilitate access to these sequences
and enable the use thereof. Accordingly, SARS virus nucleic acid and polypeptide
sequences of the invention may be recorded or stored on computer readable media,
using any technique and format that is appropriate for the particular medium.

In alternative embodiments, the invention provides computer readable media
encoded with a number of distinct nucleic acid or amino acid data sequences of the
invention. The number of distinct sequences may for example be any integer between 2
and 1 x 10^, such as at least 10^, 10^, 10^, or 10^. In one embodiment, the invention
features a computer medium having a plurality of digitally encoded data records. Each
data record may include a value representing a nucleic acid or amino acid sequence of
the invention. In some embodiments, the data record may further include values
representing the level of expression, level or activity of a nucleic acid or amino acid
sequence of the invention. The data record can be structured as a table, for example, a
table that is part of a database such as a relational database (for example, a SQL
database of the Oracle or Sybase database environments). The invention also includes a
method of communicating information about a sample, for example by transmitting
information, for example transmitting a computer readable record as described herein,
for example over a computer network. The polypeptide and nucleic acid sequences of
the invention, and sequence information pertaining thereto, may be routinely accessed
by one of ordinary skill in the art for a variety of purposes, including for the purposes
of comparing substantially identical sequences, etc. Such access may be facilitated
using publicly available software as described herein. By "computer readable media"
is meant any medium that can be read and accessed directly by a computer. Such media
include, but are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical
storage media such as RAM and ROM; and hybrids of these categories such as
magnetic/optical storage media.
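
A data record structured as a table in a relational database, as described above, might
look like the following Python sketch. It uses the SQLite engine from the Python standard
library purely for convenience, in place of the Oracle or Sybase environments named in
the text; the table name, column names, and demonstration rows are hypothetical and serve
only to show one record per sequence with an optional expression value.

```python
import sqlite3


def build_store(records):
    """Create an in-memory database holding one row per sequence record.

    records: iterable of (seq_id, molecule, residues, expression_level),
    where molecule is 'nucleic_acid' or 'amino_acid' and expression_level
    may be None when no measurement is recorded.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequence_record ("
        " seq_id TEXT PRIMARY KEY,"
        " molecule TEXT NOT NULL"
        "   CHECK (molecule IN ('nucleic_acid', 'amino_acid')),"
        " residues TEXT NOT NULL,"
        " expression_level REAL)"
    )
    conn.executemany("INSERT INTO sequence_record VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def fetch_by_type(conn, molecule):
    """Return (seq_id, residues) rows of the given molecule type, sorted by id."""
    cursor = conn.execute(
        "SELECT seq_id, residues FROM sequence_record"
        " WHERE molecule = ? ORDER BY seq_id",
        (molecule,),
    )
    return cursor.fetchall()


if __name__ == "__main__":
    demo_rows = [
        ("demo_nt", "nucleic_acid", "ACGTACGT", None),  # made-up record
        ("demo_aa", "amino_acid", "MKDEK", 1.5),        # made-up record
    ]
    store = build_store(demo_rows)
    print(fetch_by_type(store, "nucleic_acid"))
```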

Pharmaceutical and Veterinary Compositions, Dosages, and Administration

Compounds of the invention can be provided alone or in combination with other 
compounds (for example, small molecules, peptides, or peptide analogues), in the 
presence of a liposome, an adjuvant, or any pharmaceutically acceptable carrier, in a 

15 form suitable for administration to humans or to animals. 

Conventional pharmaceutical practice may be employed to provide suitable
formulations or compositions to administer the compounds to patients suffering from or
presymptomatic for SARS. Any appropriate route of administration may be employed,
for example, parenteral, intravenous, subcutaneous, intramuscular, intracranial,
intraorbital, ophthalmic, intraventricular, intracapsular, intraspinal, intracisternal,
intraperitoneal, intranasal, aerosol, or oral administration. In some embodiments,
compounds are delivered directly to the lung by, for example, formulations suitable for
inhalation. In some embodiments, gene therapy techniques may be used for
administration of SARS virus nucleic acid molecules, for example, as DNA
vaccines. Formulations may be in the form of liquid solutions or suspensions; for oral
administration, formulations may be in the form of tablets or capsules; and for
intranasal formulations, in the form of powders, nasal drops, or aerosols.

Methods well known in the art for making formulations are found in, for
example, "Remington's Pharmaceutical Sciences" (18th edition), ed. A. Gennaro, 1990,
Mack Publishing Company, Easton, Pa. Formulations for parenteral administration
may, for example, contain excipients, sterile water, or saline, polyalkylene glycols such
as polyethylene glycol, oils of vegetable origin, or hydrogenated naphthalenes.
Biocompatible, biodegradable lactide polymer, lactide/glycolide copolymer, or
polyoxyethylene-polyoxypropylene copolymers may be used to control the release of
the compounds. Other potentially useful parenteral delivery systems for modulatory
compounds include ethylene-vinyl acetate copolymer particles, osmotic pumps,
implantable infusion systems, and liposomes. Formulations for inhalation may contain
excipients, for example, lactose, or may be aqueous solutions containing, for example,
polyoxyethylene-9-lauryl ether, glycocholate and deoxycholate, or may be oily
solutions for administration in the form of nasal drops, or as a gel.

If desired, treatment with a compound according to the invention may be
combined with more traditional therapies for the disease.

For therapeutic or prophylactic compositions, the compounds are administered
to an individual in an amount sufficient to stop or slow the replication of the SARS
virus, or to confer protective immunity against future SARS virus infection. Amounts
considered sufficient will vary according to the specific compound used, the mode of
administration, the stage and severity of the disease, the age, sex, and health of the
individual being treated, and concurrent treatments. As a general rule, however,
dosages can range from about 1 µg to about 100 mg per kg body weight of a patient for
an initial dosage, with subsequent adjustments depending on the patient's response,
which can be measured, for example, by determining the presence of SARS nucleic acid
molecules, polypeptides, or virions in the patient's peripheral blood.

In the case of vaccine formulations, an immunogenically effective amount of a
compound of the invention can be provided, alone or in combination with other
compounds, with an adjuvant, for example, Freund's incomplete adjuvant or aluminum
hydroxide. The compound may also be linked with a carrier molecule, such as bovine
serum albumin or keyhole limpet hemocyanin, to enhance immunogenicity.

In general, compounds of the invention should be used without causing substantial
toxicity. Toxicity of the compounds of the invention can be determined using standard
techniques, for example, by testing in cell cultures or experimental animals and
determining the therapeutic index, i.e., the ratio between the LD50 (the dose lethal to
50% of the population) and the LD100 (the dose lethal to 100% of the population). In
some circumstances however, such as in severe disease conditions, it may be necessary
to administer substantial excesses of the compositions.

Virus Isolation

Virus isolation was performed on a bronchoalveolar lavage specimen of a fatal
SARS case belonging to the original case cluster from Toronto, Canada. All work with
the infectious agent was performed in a biosafety level 3 (BSL3) laboratory using an
N100 mask for personal protection. Samples were removed from BSL3 after addition
of the RNA extraction buffer. The virus isolate, named the "Tor2 isolate", was grown
in African Green Monkey Kidney (Vero E6) cells, the viral particles were purified, and
the genetic material (RNA) was extracted from the Tor2 isolate (Poutanen, S. M. et al.,
N Engl J Med, Apr 10, 2003). More specifically, one hundred microlitre specimens
were used to inoculate Vero E6 cells (ATCC CRL 1586) on Dulbecco's Modified
Eagle Medium supplemented with penicillin/streptomycin, glutamine and 2% fetal calf
serum. The culture was incubated at 37°C. Cytopathogenic effect was observed 5 days
post inoculation. The virus was passaged into newly seeded Vero E6 cells which
showed a cytopathogenic effect as early as 2 days post infection (multiplicity of
infection 10^). A virus stock was prepared from passage 2 of these cells and preserved
in liquid nitrogen. The titer of the virus stock was determined to be 1x10 plaque
forming units (p.f.u.) by plaque assay and 5 x 10^ by tissue culture infectious dose
(TCID) 50.

For virus propagation, 10 x T-162 flasks of Vero E6 cells were infected with a
multiplicity of infection of 10^. When infected cells showed a cytopathogenic effect of
'4+' (48 hours post infection), the cultures were then frozen and thawed to lyse the
cells, and the supernatants were clarified from cell debris by centrifugation at 10,000
rpm in a Beckman high-speed centrifuge. The supernatants were treated with DNAse
and RNAse for 3 hours at 37°C to remove any cellular genomic nucleic acids and
subsequently extracted with an equal volume of 1,1,2-trichloro-trifluoroethane. The
top fraction was ultra-centrifuged through a 5% / 40% glycerol step gradient at 151,000
x g for 1 hour at 4°C. The virus pellet was resuspended in PBS. RNA was isolated
using a commercial kit from QIAGEN and stored at -80°C for further use.

cDNA Library Construction

The RNA and subsequent products were handled under biosafety level 2
(BSL2) conditions. The RNA sample was converted to a cDNA library, using a
combined random-priming and oligo-dT priming strategy, and resultant subgenomic
clones were processed under level 1 biosafety conditions. More specifically, purified
viral RNA (55 ng) was used in the construction of a random primed and oligo-dT
primed cDNA library, using the Superscript Choice System for cDNA synthesis
(Invitrogen). Linkers 5'-AATTCGCGGCCGCGTCGAC-3', SEQ ID NO: 195, and
5'-pGTCGACGCGGCCGCG-3', SEQ ID NO: 196, were ligated following cDNA
synthesis. The cDNA synthesis products were visualized on agarose gels, revealing the
anticipated low-yield smear. To produce sufficient cDNA for cloning, the cDNA
product was size fractionated on a low-melting point preparative agarose gel, followed
by PCR amplification using a single PCR primer 5'-AATTCGCGGCCGCGTCGAC-3',
SEQ ID NO: 197, specific to the linkers. This yielded sufficient material for cloning.

Size-selected cDNA products were cloned and single sequence reads were
generated from each end of the insert from randomly picked clones. A list of the SARS
virus clones is provided in the accompanying sequence listing, which is incorporated
by reference herein (SEQ ID NOs: 92-159, 208 and 209).
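
The relationship between the two linker oligonucleotides given above can be checked
computationally. The Python sketch below is an illustration only: it tests whether the
shorter strand (SEQ ID NO: 196) is complementary to the 3' portion of the longer strand
(SEQ ID NO: 195), reports the single-stranded 5' overhang that would remain, and locates
the Not I recognition sequence (GCGGCCGC) in the longer strand. The search for a Sal I
recognition sequence (GTCGAC) is an addition of this sketch and is not mentioned in the
text; all function and variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

LONG_STRAND = "AATTCGCGGCCGCGTCGAC"  # SEQ ID NO: 195, written 5' to 3'
SHORT_STRAND = "GTCGACGCGGCCGCG"     # SEQ ID NO: 196, written 5' to 3'

# Recognition sequences; Sal I is included as an assumption of this sketch.
SITES = {"NotI": "GCGGCCGC", "SalI": "GTCGAC"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def five_prime_overhang(long_strand, short_strand):
    """Return the unpaired 5' end of long_strand when the strands anneal.

    Returns None if short_strand does not pair with the 3' end of
    long_strand.
    """
    paired = reverse_complement(short_strand)
    if not long_strand.endswith(paired):
        return None
    return long_strand[: len(long_strand) - len(paired)]


def find_sites(seq):
    """Map each site name to its 0-based start in seq (-1 if absent)."""
    return {name: seq.find(site) for name, site in SITES.items()}


if __name__ == "__main__":
    print("overhang:", five_prime_overhang(LONG_STRAND, SHORT_STRAND))
    print("sites:", find_sites(LONG_STRAND))
```

Under these assumptions the annealed linker leaves a four-nucleotide AATT overhang on
the longer strand, and the Not I sequence lies within the double-stranded portion,
which is consistent with the later use of Not I digestion for cloning.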

More specifically, size-selected cDNAs were ligated into the pCR4-TOPO TA
cloning vector (Invitrogen, CA), or after digestion with the restriction nuclease Not I
into the pBR194c vector (The Institute for Genomic Research, Rockville, MD, USA).
Ligated clones were then transformed by electroporation into DH10B T1 cells
(Invitrogen), plated on 22 cm agar plates with the appropriate antibiotic and grown for
16 hours at 37°C. Colonies were picked into 384-well Axygen culture blocks
containing 2x YT media and grown in a shaking incubator for 18 hours at 37°C. Cells
were lysed and DNA purified using standard laboratory procedures. Sequencing
primers for the 194c clones were 5'-GGCCTCTTCGCTATTACGC-3' (forward
primer) and 5'-TGCAGGTCGACTCTAGAGGAT-3' (reverse primer).

DNA Sequencing And Assembly Of Reads

Sequences were assembled and the assembly edited to produce the genomic
sequence of the SARS virus. More specifically, DNA sequencing of both ends of the
plasmid templates was achieved using Applied Biosystems BigDye terminator reagent



wo 2004/096842 



PCT/CA2004/000626 



(version 3), with electrophoresis and data collection on AB 3700 and 3730 XL
instruments. DNA sequence reads were screened for non-viral contaminating
sequences, trimmed for quality using PHRED (Ewing, B. and P. Green, Genome Res 8,
186-94, Mar, 1998) and assembled using PHRAP (Gordon, D. et al., Genome Res 8,
195-202, Mar, 1998). Simultaneously, sequences were used in BLAST searches of
viral nucleotide and non-redundant protein datasets (NCBI, National Library of
Medicine) to search for similarities. Sequence assemblies were visualized using
CONSED (Gordon, D. et al., Genome Res 8, 195-202, Mar, 1998). Sequence
mis-assemblies and contig joins were identified using Miropeats (Parsons, J. D., Comput
Appl Biosci 11, 615-9, Dec, 1995). As sequence data accrued, the additional sequences
were assembled until it became apparent that the additional depth of sampling was
increasing depth of coverage but not extending the length of the contig. At this point,
3,080 sequencing reads were generated, 2,634 of which were assembled into a single
large contig.

The sequence information was imported into an ACEDB database (Durbin, R. and J.
Thierry-Mieg. 1991-. A C. elegans Database. Documentation, code and data available
from anonymous FTP servers at lirmm.lirmm.fr, cele.mrc-lmb.cam.ac.uk and
ncbi.nlm.nih.gov) and subjected to biological analysis including the identification of
open reading frames, detection of similar sequences by BLAST and searching for
apparent frameshifts. When frameshifts were identified by this analysis, the sequence
assembly was consulted for evidence of sequencing errors and, if found, they were
corrected. The sequences were also searched for any that could extend the 5' end of the
sequence and these were incorporated when found. High quality sequence
discrepancies between different sequence reads were identified and resolved. Sequence
reads classified as deleted or chimeric were identified through manual inspection and
removed from the assembly. The resulting sequence has an average PHRED consensus
quality score of 89.96. The lowest quality bases in the assembly are in the immediate
vicinity of the 5' and 3' ends of the viral genome, with the lowest quality base having a
PHRED score of 35. Most (29,694 of the 29,736 (99.86%)) of the bases have a
consensus score of 90. Almost all regions of the genome are represented by reads
derived from both strands of the plasmid sequencing templates, the exceptions being 50
bases at the 5' end represented by a single sequencing read, and 5 bases at the 3' end
represented by a single read. The average base in the assembly is represented by 30
reads in the forward direction and 30 reads in the reverse direction, as determined by
PHRED. RT-PCR products predicted from the sequence and spanning the entire
genome yield PCR products of the anticipated size on agarose gels. To confirm the 5'
end of the viral genome, RACE was performed using the RLM-RACE kit from Ambion,
and primers 5'-CAGGAAACAGCTATGACACCAAGAACAAGGCTCTCCA-3'
(SEQ ID NO: 90) and
5'-CAGGAAACAGCTATGACGATAGGGCCTCTTCCACAGA-3' (SEQ ID NO: 91).
Fourteen clones were recovered and sequenced. Analysis of these sequences confirmed
the 5' end of the coronavirus genome. The SARS genomic sequences have been
deposited into Genbank (Accession Nos. AY274119.1, AY274119.2, and
AY274119.3).
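
The assembly figures quoted above can be put in context with a short calculation. The
Python sketch below is illustrative only. It relies on the standard definition of a
PHRED quality score, Q = -10 log10(P), where P is the estimated probability that a base
call is wrong, and recomputes two percentages from the read and base counts stated in
the text. The function names are hypothetical.

```python
def phred_error_probability(q):
    """Estimated probability of a wrong base call for PHRED score q.

    Uses the standard relation Q = -10 * log10(P), i.e. P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    # Counts taken from the text above.
    reads_total, reads_in_contig = 3080, 2634
    bases_total, bases_at_q90 = 29736, 29694

    print("reads in large contig (%):", round(percent(reads_in_contig, reads_total), 2))
    print("bases at consensus score 90 (%):", round(percent(bases_at_q90, bases_total), 2))
    for score in (35, 90):
        print("PHRED", score, "error probability:", phred_error_probability(score))
```

On these figures roughly 85.5% of the reads fall in the single large contig, and a
consensus score of 90 corresponds to an estimated error probability of about one in a
billion per base, against about three in ten thousand for the lowest-scoring base.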



While the invention has been described in connection with specific
embodiments thereof, it will be understood that it is capable of further modifications
and this application is intended to cover any variations, uses, or adaptations of the
invention following, in general, the principles of the invention and including such
departures from the present disclosure that come within known or customary practice
within the art to which the invention pertains, and may be applied to the essential
features set forth herein and in the scope of the appended claims.

All patents, patent applications, and publications referred to herein are hereby 
incorporated by reference in their entirety to the same extent as if each individual 
patent, patent application, or publication was specifically and individually indicated to 
be incorporated by reference in its entirety.

What is claimed is:

1. A substantially pure SARS virus nucleic acid molecule.



2. The molecule of claim 1, wherein said molecule is selected from the group
consisting of genomic RNA or DNA, cDNA, synthetic DNA, or mRNA.

3. The molecule of claim 1 or 2, wherein said molecule comprises a sequence
substantially identical to a sequence selected from the group consisting of SEQ ID
NOs: 1-13, 15-18, 20-30, 90-159, 208, and 209 or a fragment thereof.



4. The molecule of claim 3, wherein said molecule comprises a sequence selected 
from the group consisting of SEQ ID NO: 1, SEQ ID NO:2, and SEQ ID NO: 15 or a 
fragment thereof. 

5. The molecule of claim 3, wherein said molecule comprises a sequence 
substantially identical to a sequence selected from the group consisting of SEQ ID NO: 
1, SEQ ID NO:2, and SEQ ID NO: 15, or a fragment thereof. 



6. The molecule of any one of claims 1 through 3, wherein said molecule
comprises a s2m motif.

7. The molecule of claim 6, wherein said s2m motif comprises a sequence
substantially identical to a sequence selected from the group consisting of SEQ ID
NOs: 16, 17, and 18.



8. The molecule of any one of claims 1 through 3, wherein said molecule
comprises a leader sequence.

9. The molecule of claim 8, wherein said leader sequence comprises a sequence
substantially identical to the sequence of SEQ ID NO: 3.

10. The molecule of any one of claims 1 through 3, wherein said molecule
comprises a transcriptional regulatory sequence.

11. The molecule of claim 10, wherein said transcriptional regulatory sequence
comprises a sequence substantially identical to the sequence selected from the group
consisting of SEQ ID NOs: 4-13 and 20-30.

12. The molecule of claim 1, wherein said molecule comprises a sequence
substantially identical to a sequence selected from nucleotides 265-13,398;
13,398-21,485; 21,492-25,259; 25,268-26,092; 25,689-26,153; 26,117-26,347;
26,398-27,063; 27,074-27,265; 27,273-27,641; 27,638-27,772; 27,779-27,898;
27,864-28,118; 28,120-29,388; 28,130-28,426; 28,583-28,795; and 29,590-29,621 of
SEQ ID NO: 15.

13. The molecule of any one of claims 1 through 3, wherein said molecule encodes
a polyprotein.

14. The molecule of any one of claims 1 through 3, wherein said molecule encodes
a polypeptide.

15. A substantially pure SARS virus polypeptide.

16. The polypeptide of claim 15, wherein said polypeptide comprises a polyprotein. 



17. The polypeptide of claim 15, wherein said polypeptide comprises an identifiable
signal sequence.

18. The polypeptide of claim 17, wherein said signal sequence comprises a
sequence substantially identical to a sequence selected from the group consisting of
SEQ ID NOs: 76 and 85.

19. The polypeptide of claim 15, wherein said polypeptide comprises a
transmembrane domain.

20. The polypeptide of claim 19, wherein said transmembrane domain comprises a
sequence substantially identical to a sequence selected from the group consisting of
SEQ ID NOs: 77-86.

21. The polypeptide of claim 15, wherein said polypeptide comprises a
glycoprotein.

22. The polypeptide of claim 21, wherein said glycoprotein comprises a matrix
glycoprotein.

23. The polypeptide of claim 22, wherein said matrix glycoprotein comprises a
sequence substantially identical to SEQ ID NO: 34.

24. The polypeptide of claim 15, wherein said polypeptide is selected from the
group consisting of a transmembrane protein and a multitransmembrane protein.

25. The polypeptide of claim 15, wherein said polypeptide is selected from the
group consisting of a type I transmembrane protein and a type II transmembrane
protein.

26. The polypeptide of claim 24, wherein said polypeptide comprises a
transmembrane anchor or a transmembrane helix.

27. The polypeptide of any one of claims 1 through 3, wherein said polypeptide
comprises an epitope of a SARS virus.

28. The polypeptide of claim 15, wherein said polypeptide comprises an
ATP-binding domain.

29. The polypeptide of claim 15, wherein said polypeptide comprises a viral
envelope protein.

30. The polypeptide of claim 15, wherein said polypeptide comprises a nuclear
localization signal.

31. The polypeptide of claim 15, wherein said polypeptide comprises a lysine-rich
sequence.

32. The polypeptide of claim 31, wherein said lysine-rich sequence comprises a
sequence substantially identical to SEQ ID NO: 14.

33. The polypeptide of claim 15, wherein said polypeptide comprises a RNA
binding protein.

15 

34. The polypeptide of claim 15, wherein said polypeptide comprises a hydrophilic 
domain. 

35. The polypeptide of claim 34, wherein said hydrophilic domain comprises a 
20 sequence substantially identical to SEQ ID NO: 87. 

36. The polypeptide of claim 15, wherein said polypeptide is selected from the 
group consisting of replicase la, replicase lb, spike glycoprotein, small envelope 
protein, matrix glycoprotein, and nucleocapsid protein, 

25 

37. The polypeptide of claim 15, wherein said polypeptide comprises a sequence 
substantially identical to a sequence selected from the group consisting of SEQ ID 
NOs: 14, 33-36, 64-74, and 76-87 or a fragment thereof. 



30 



38. A vector comprising the nucleic acid molecule of claim 1. 
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39. The vector of claim 38, wherein said vector comprises a sequence substantially 
identical to a sequence selected from the group consisting of SEQ ID NOs: 1-13, 15-18, 
20-30, 90-159, 208, and 209. 

40. The vector of claim 38, wherein said vector is a gene therapy vector. 

41. A host cell comprising the vector of claim 38. 

42. The host cell of claim 41, wherein said cell is selected from the group consisting 
of a mammalian cell, a yeast, a bacterium, and a nematode cell. 

43. A nucleic acid molecule having substantial nucleotide sequence identity to a 
sequence encoding a SARS virus polypeptide or fragment thereof, wherein said 
fragment comprises at least six amino acids, and wherein said nucleic acid molecule 
hybridizes under high stringency conditions to at least a portion of a SARS virus 
nucleic acid molecule. 

44. The nucleic acid molecule of claim 43, wherein said nucleic acid molecule has 
100% sequence complementarity to said sequence encoding a SARS virus polypeptide 
or fragment thereof. 

45. A nucleic acid molecule having substantial nucleotide sequence identity to a 
SARS virus nucleotide sequence, wherein said nucleic acid molecule comprises at least 
ten nucleotides, and wherein said nucleic acid molecule hybridizes under high 
stringency conditions to at least a portion of a SARS virus nucleic acid molecule. 

46. The nucleic acid molecule of claim 45, wherein said nucleic acid molecule has 
100% sequence complementarity to said SARS virus nucleotide sequence. 






47. A nucleic acid molecule comprising a sequence that is antisense to a SARS 
virus nucleic acid molecule. 

48. An antibody that specifically binds to a SARS virus polypeptide. 

49. The antibody of claim 48, wherein said antibody is a neutralizing antibody. 

50. A method for detecting a SARS virus virion or polypeptide in a sample, said 
method comprising contacting said sample with the antibody of claim 48, and 
determining whether said antibody specifically binds to said polypeptide. 

51. A method for detecting a SARS virus genome or gene or homolog or fragment 
thereof in a sample, said method comprising contacting a SARS virus nucleic acid 
molecule, wherein said nucleic acid molecule comprises at least ten nucleotides, with a 
preparation of DNA from said sample, under hybridization conditions providing 
detection of DNA sequences having nucleotide sequence identity to a SARS virus 
nucleic acid molecule. 


52. The method of claim 51, wherein said nucleic acid molecule comprises at least 
one of a primer pair, wherein said primer pair hybridizes to said SARS virus genome 
or gene or homolog or fragment thereof under conditions suitable for polymerase chain 
reaction. 


53. A method of targeting a protein for secretion from a cell, said method 
comprising attaching a signal sequence from a SARS virus polypeptide to said protein, 
such that said protein is secreted from said cell. 

54. A nucleic acid molecule comprising a sequence complementary to a SARS 
virus nucleotide sequence. 

55. A kit for detecting the presence of a SARS virus nucleic acid molecule or 
polypeptide in a sample, said kit comprising a reagent selected from the group 
consisting of a SARS virus nucleic acid molecule and an antibody that specifically 
binds a SARS virus polypeptide. 

56. A method for eliciting an immune response in an animal, said method 
comprising identifying an animal infected with or at risk for infection with a SARS 
virus, and administering a SARS virus polypeptide or fragment thereof, or 
administering a SARS virus nucleic acid molecule encoding a SARS virus polypeptide 
or fragment thereof, to said animal. 

57. The method of claim 56, wherein said administering results in the production of 
an antibody in said animal. 

58. The method of claim 56, wherein said administering results in the generation of 
cytotoxic or helper T-lymphocytes in said animal. 

59. A method for treating or preventing a SARS virus infection comprising 
identifying an animal infected with or at risk for infection with a SARS virus, and 
administering a SARS virus nucleic acid molecule or polypeptide, or administering a 
compound that inhibits pathogenicity or replication of a SARS virus, to the animal. 

60. The method of claim 59, wherein the animal is a human. 

61. Use of a SARS virus nucleic acid molecule or polypeptide for treating or 
preventing a SARS virus infection. 

62. A method of identifying a compound for treating or preventing a SARS virus 
infection, comprising contacting a sample comprising a SARS virus nucleic acid 
molecule or contacting a SARS virus polypeptide with the compound, wherein an 
increase or decrease in the expression or activity of the nucleic acid molecule or the 
polypeptide identifies a compound for treating or preventing a SARS virus infection. 

63. A vaccine comprising a SARS virus nucleic acid molecule or polypeptide. 


64. The vaccine of claim 62, wherein the vaccine is a DNA vaccine. 



SUBSTITUTE SHEET (RULE 26) 

65. A microarray comprising a plurality of elements, wherein each element 
comprises one or more distinct nucleic acid or amino acid sequences, and wherein the 
sequences are selected from a SARS virus nucleic acid molecule or polypeptide, or an 
antibody that specifically binds a SARS virus nucleic acid molecule or polypeptide. 


66. A computer readable record comprising distinct SARS virus nucleic acid or 
amino acid sequences. 

67. The computer readable record of claim 65, wherein the computer readable 
record comprises a database. 



Figure 1A 

Figure 1B 

Figure 1C (recoverable labels: Nucleocapsid; SARS; GROUP I) 
CTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGT 
AGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAATTTTACTGTCGTTGACA 
AGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATCATCAGCA 
TACCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAAC 
ACACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGGGACTCTGTGG 
AAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGTCTAGTAGAGCTGGAAAAAGGC 
GTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAACGTTCTGATGCCTTAAGCACCAATCACGGCCA 
CAAGGTCGTTGAGCTGGTTGCAGAAATGGACGGCATTCAGTACGGTCGTAGCGGTATAACACTGGGAGTAC 
TCGTGCCACATGTGGGCGAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAAGGGA 
GCCGGTGGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGCACTGATCCCAT 
TGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAACTCACTCGTGAGCTCA 
ATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGCCCAGATGGGTACCCTCTTGATTGCATC 
AAAGATTTTCTCGCACGCGCGGGCAAGTCAATGTGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAA 
GAGAGGTGTCTACTGCTGCCGTGACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTGATAAGAGCT 
ACGAGCACCAGACACCCTTCGAAATTAAGAGTGCCAAGAAATTTGACACTTTCAAAGGGGAATGCCCAAAG 
TTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAGACTGAGGGTTT 
CATGGGGCGTAa?ACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGTAACAATATGCACTTGTCTACCT 
TGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAGACGTGCGACTTTCTGAAAGCCACTTGTGAACAT 
TGTGGCACTGAAAATTTAGTTATTGAAGGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAA 
AATGCCATGTCCTGCCTGTCAAGACCCAGAGATTGGACCTGAGCATAGTGTTGCAGATTATCACAACCACT 
CAAACATTGAAACTCGACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCCTATGTT 
GGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGCTGATATTGGCTCAGGCCATACTGGCAT 
TACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAGATACTGAGTCGTGAACGTGTTAACATTA 
ACATTGTTGGCGATTTTCATTTGAATGAAGAGGTTGCCATCATTTTGGCATCTTTCTCTGCTTCTACAAGT 
GCCTTTATTGACACTATAAAGAGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTA 
TAAAGTTACCAAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCAC 
TGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGATCAATTTTTGCGCGCACACTTGATGCAGCAAAC 
CACTCAATTCCTGATTTGCAAAGAGCAGCTGTCACCATACTTGATGGTATTTCTGAACAGTCATTACGTCT 
TGTCGACGCCATGGTTTATACTTCAGACCTGCTCACCAACAGTGTCATTATTATGGCATATGTAACTGGTG 
GTCTTGTACAACAGACTTCTCAGTGGTTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATC 
TTTGAATGGATTGAGGCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATGCTTGGGAGATTCTCAAATT 
TCTCATTACAGGTGTTTTTGACATCGTCAAGGGTCAAATACAGGTTGCTTCAGATAACATCAAGGATTGTG 
TAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAAGTCACTATCGCTGGCGCA 
AAGTTGCGATCACTCAACTa?AGGTGAAGTCTTCATCGCTCAAAGCAAGGGACTTTACCGTCAGTGTATACG 
TGGCAAGGAGCAGCTGCAACTACTCATGCCTCTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATT 
CACATGACACAGTACTTACCTCTGAGGAGGTTGTTCTCAAGAACGGTGAACTCGAAGCACTCGAGACGCCC 
GTTGATAGCTTCACAAATGGAGCTATCGTCGGCACACCAGTCTGTGTAAATGGCCTCATGCTCTTAGAGAT 
TAAGGACAAAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTCTTTCGCTTAAAAG 
GGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGATACTGTTTGGGAAGTTCAAGGTTACAAGAATGTG 
AGAATCACATTTGAGCTTGATGAACGTGTTGACAAAGTGCTTAATGAAAAGTGCTCTGTCTACACTGTTGA 
ATCCGGTACCGAAGTTACTGAGTTTGCATGTGTTGTAGCAGAGGCTGTTGTGAAGACTTTACAACCAGTTT 
CTGATCTCCTTACCAACATGGGTATTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGAT 
GCTGGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAAGAGGACGA 
TGCAGAGTGTGAGGAAGAAGAAATTGATGAAACCTGTGAACATGAGTACGGTACAGAGGATGATTATCAAG 
GTCTCCCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGAGTTGAGGAAGAAGAAGAGGAAGACTGGCTG 
GATGATACTACTGAGCAATCAGAGATTGAGCCAGAACCAGAACCTACACCTGAAGAACCAGTTAATCAGTT 
TACTGGTTATTTAAAACTTACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAGTG 
CTAATCCTATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTGCACTC 
AACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAATGGCCCTCTTACAGT 
AGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGTCTGCATGTTGTTGGACCTAACCTAA 
ATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCATATGAAAATTTCAATTCACAGGACATCTTACTTGCA 
CCATTGTTGTCAGCAGGCATATTTGGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGACGGTTCG 
TACACAGGTTTATATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACC 
TGAAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACTGAGGAGAAA 
TCTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATTGATGAGGTTACCACAACACT 
GGAAGAAACTAAGTTTCTTACCAATAAGTTACTCTTGTTTGCTGATATCAATGGTAAGCTTTACCATGATT 
CTCAGl^CATGCTTAGAGGTGAAGATATGTCTTTCCTTGAGAAGGATGCACCTTACATGGTAGGTGATGTT 



FIGURE 3A 
ATCACTAGTGGTGATATCACTTGTGTTGTAATACCCTCCAAAAAGGCTGGTGGCACTACTGAGATGCTCTC 
AAGAGCTTTGAAGAAAGTGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGTTATA 
CACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATTTTATGTACTACCTTCAGAAGCACCT 
AATGCTAAGGAAGAGATTCTAGGAACTGTATCCTGGAATTTGAGAGAAATGCTTGCTCATGCTGAAGAGAC 
AAGAAAATTAATGCCTATATGCATGGATGTTAGAGCCATAATGGCAACCATCCAACGTAAGTATAAAGG2VA 
TTAAAATTCAAGAGGGCATCGTTGACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCT 
TCTATTATTACGAAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGTGACACATGG 
TTTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCGTAGTGTCAGTATCATCAC 
CAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACATCTGAGGAGCACTTTGTAGAA 
ACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTATTCAGGACAGCGTACAGAGTTAGGTGTTGAATT 
TCTTAAGCGTGGTGACAAAATTGTGTACCACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGG 
TTCTTTCACTTGACAAACTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAAAAGTGTTCACAACT 
GTGGACAACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGTCCAAC 
ATACTTGGATGGTGCTGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGTAAGACTTTCTTTGTAC 
TACCTAGTGATGACACACTACGTAGTGAAGGTTTCGAGTACTACCATACTCTTGATGAGAGTTTTCTTGGT 
AGGTACATGTCTGCTTTAAACCACACAAAGAAATGGAAATTTCCTCAAGTTGGTGGTTTAACTTCAATTAA 
ATGGGCTGATAACAATTGTTATTTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATG 
CACCAGCACTTCAAGAGGCTTATTATAGAGCCCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTC 
GCTTACAGTAATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTTCTACAGCATGC 
TAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTGTAAACATTGTGGTCAGAAAACTACTACCTTAA 
CGGGTGTAGAAGCTGTGATGTATATGGGTACTCTATCTTATGATAATCTTAAGACAGGTGTTTCCATTCCA 
TGTGTGTGTGGTCGTGATGCTACACAATATCTAGTACAACAAGAGTCTTCTTTTGTTATGATGTCTGCACC 
ACCTGCTGAGTATAAATTACAGCAAGGTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTG 
GTCATTACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCACCTTACAAAGATGTCA 
GAGTACAAAGGACCAGTGACTGATGTTTTCTACAAGGAAACATCTTACACTACAACCATCAAGCCTGTGTC 
GTATAAACTCGATGGAGTTACTTACACAGAGATTGAACCAAAATTGGATGGGTATTATAAAAAGGATAATG 
CTTACTATACAGAGCAGCCTATAGACCTTGTACCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTC 
AAACTCACATGTTCTAACACAAAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCAGCTTC 
ACGAGAGCTATCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGACACTATT 
CAGCGAGTTTCAAGAAAGGTGCTAAATTACTGCATAAGCCAATTGTTTGGCACATTAACCAGGCTACAACC 
AAGACAACGTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGTACAAAGCCAGTAGATACTTCAAA 
TTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGAATGGACAATCTTGCTTGTGAAAGTCAACAACCCA 
CCTCTGAAGAAGTAGTGGAAAATCCTACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAA 
GTTGTAGGCAATGTCATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATGAGGA 
TCTTATGGCTGCTTATGTGGAAAACACAAGCATTACCATTAAGAAACCTAATGAGCTTTCACTAGCCTTAG 
GTTTAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGGAGTAAAATTTTGGCTTAT 
GTCAAACCATTCTTAGGACAAGCAGCAATTACAACATCAAATTGCGCTAAGAGATTAGCACAACGTGTGTT 
TAACAATTATATGCCTTATGTGTTTACATTATTGTTCCAATTGTGTACTTTTACTAAAAGTACCAATTCTA 
GAATTAGAGCTTCACTACCTACAACTATTGCTAAAAATAGTGTTAAGAGTGTTGCTAAATTATGTTTGGAT 
GCCGGCATTAATTATGTGAAGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTGTT 
AAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTCTTATCTAATTTTGGTGCTC 
CTTCTTATTGTAATGGCGTTAGAGAATTGTATCTTAATTCGTCTAACGTTACTACTATGGATTTCTGTGAA 
GGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTAGACTCCCTTGATTCTTATCCAGCTCTTGAAACCAT 
TCAGGTGACGATTTCATCGTACAAGCTAGACTTGACAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCAT 
ATATGTTGTTCACAAAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGGCTATTTT 
GCTAGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCACCCGTTTC 
TGCAATGGTTAGGATGTACATCTTCTTTGCTTCTTTCTACTACATATGGAAGAGCTATGTTCATATCATGG 
ATGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGCAATCGTGCCACACGCGTTGAGTGTACAACT 
ATTGTTAATGGCATGAAGAGATCTTTCTATGTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAA 
TTGGAATTGTCTCAATTGTGACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAGTTGCTCGTGATT 
TGTCACTCCAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCTGTG 
AAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGACATCCGCTCTCCCA 
TTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCACTGCCTATTAATGTCATAGTTTTTG 
ATGGCAAGTCCAAATGCGACGAGTCTGCTTCTAAGTCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAA 
CCTATTCTGTTGCTTGACCAAGCTCTTGTATCAGACGTTGGAGATAGTACTGAAGTTTCCGTTAAGATGTT 
TGATGCTTATGTCGACACCTTTTCAGCAACTTTTAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTA 
CAGCTCACAGCGAGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCAGCTGCCCGA 
CAAGGTGTTGTTGATACCGATGTTGACACAAAGGATGTTATTGAATGTCTCAAACTTTCACATCACTCTGA 
CTTAGAAGTGACAGGTGACAGTTGTAACAATTTCATGCTCACCTATAATAAGGTTGAAAACATGACGCCCA 
GAGATCTTGGCGCATGTATTGACTGTAATGCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTT 
TCACTCATCTGGAATGTAAAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTAGTGCTGC 
CAAGAAGAACAACATACCTTTTAGACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAACTACTA 
AAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTTGTTTTAAACTTATGCTTAAGGCCACATTATTGTGC 
GTTCTTGCTGCATTGGTTTGTTATATCGTTATGCCAGTACATACATTGTCAATCCATGATGGTTACACAAA 
TGAAATCATTGGTTACAAAGCCATTCAGGATGGTGTCACTCGTGACATCATTTCTACTGATGATTGTTTTG 
CAAATAAACATGCTGGTTTTGACGCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAAAGCTGC 
CCTGTAGTAGCTGCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCTTACCGGGTACTGTGCTGAG 
AGCAATCAATGGTGACTTCTTGCATTTTCTACCTCGTGTTTTTAGTGCTGTTGGCAACATTTGCTACACAC 
CTTCCAAACTCATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTTGCTGCTGAGTGTACAATTTTT 
AAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGACACTAATTTGCTAGAGGGTTCTATTTCTTATAG 
TGAGCTTCGTCCAGACACTCGTTATGTGCTTATGGATGGTTCCATCATACAGTTTCCTAACACTTACCTGG 
AGGGTTCTGTTAGAGTAGTAACAACTTTTGATGCTGAGTACTGTAGACATGGTACATGCGAAAGGTCAGAA 
GTAGGTATTTGCCTATCTACCAGTGGTAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCAGGAGT 
TTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTGCAACCTGTGGGTGCTT 
TAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATATTGGTGACTTGTGCTGCCTACTACTTT 
ATGAAATTCAGACGTGTTTTTGGTGAGTACAACCATGTTGTTGCTGCTAATGCACTTTTGTTTTTGATGTC 
TTTCACTATACTCTGTCTGGTACCAGCTTACAGCTTTCTGCCGGGAGTCTACTCAGTCTTTTACTTGTACT 
TGACATTCTATTTCACCAATGATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT 
GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATTGGTTCTTTAACAA 
CTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTCGAGGAGGCTGCTTTGTGTACCT 
TTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGCGAGACACTGTTGCCACTTACACAGTATAACAGG 
TATCTTGCTCTATATAACAAGTACAAGTATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGC 
TTGCTGCCACTTAGCAAAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCAC- 
AGACATCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAATGGCATTCCCGTCAGGCAAAGTTGAA 
GGGTGCATGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTGGATGACACAGTATACTG 
TCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCTAACTATGAAGATCTGCTCATTCGCAAAT 
CCAACCATAGCTTTCTTGTTCAGGCTGGCAATGTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGT 
CTGCTTAGGCTTAAAGTTGATACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTATCCAACCTGG 
TCAAACATTTTCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCTA 
ATCATACCATTAAAGGTTCTTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATTGATTATGATTGC 
GTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACACGCTGGTACTGACTTAGAAGGTAA 
ATTCTATGGTCCATTTGTTGACAGACAAACTGCACAGGCTGCAGGTACAGACACAACCATAACATTAAATG 
TTTTGGCATGGCTGTATGCTGCTGTTATCAATGGTGATAGGTGGTTTCTTAATAGATTCACCACTACTTTG 
AATGACTTTAACCTTGTGGCAATGAAGTACAACTATGAACCTTTGACACAAGATCATGTTGACATATTGGG 
ACCTCTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTGCAGAATG 
GTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGATGAGTTTACACCATTTGATGTTGTTAGA 
CAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATTGTTAAGGGCACTCATCATTGGATGCTTTT 
AACTTTCTTGACATCACTATTGATTCTTGTTCAAAGTACACAGTGGTCACTGTTTTTCTTTGTTTACGAGA 
ATGCTTTCTTGCCATTTACTCTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAG 
CACGCATTCTTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATATGGTCTACATGCC 
TGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAATTGGCTGACACTAGCTTGTCTGGTTATAGGCTTA 
AGGATTGTGTTATGTATGCTTCAGCTTTAGTTTTGCTTATTCTCATGACAGCTCGCACTGTTTATGATGAT 
GCTGCTAGACGTGTTTGGACACTGATGAATGTCATTACACTTGTTTACAAAGTCTACTATGGTAATGCTTT 
AGATCAAGCTATTTCCATGTGGGCCTTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTA 
TCATGTTTTTAGCTAGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAAC 
ACCTTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGCCTTTTCTG 
TTTACTCAACCGTTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTCTCTACACAAGAATTTAGGT 
ATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATTGATGCTTTCAAGCTTAACATTAAGTTGTTG 
GGTATTGGAGGTAAACCATGTATCAAGGTTGCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACATC 
TGTGGTACTGCTCTCGGTTCTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTAC 
AACTCCACAATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTGTCT 
GTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTCGATAACCGTGCTAC 
TCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCCGCTTATGCCACTGCCCAGGAGGCCT 
ATGAGCAGGCTGTAGCTAATGGTGATTCTGAAGTCGTTCTCAAAAAGTTAAAGAAATCTTTGAATGTGGCT 
AAATCTGAGTTTGACCGTGATGCTGCCATGCAACGCA?^GTTGGAAAAGATGGCAGATCAGGCTATGACCCA 
AATGTACAAACAGGCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTCA 
CTATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGTTGTGTTCCA 
CTCAACATCATACCATTGACTACAGCAGCCAAACTCATGGTTGTTGTCCCTGATTATGGTACCTACAAGAA 
CACTTGTGATGGTAACACCTTTACATATGCATCTGCACTCTGGGAAATCCAGCAAGTTGTTGATGCGGATA 
GCAAGATTGTTCAACTTAGTGAAATTAACATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTGTTACA 
GCTCTAAGAGCCAACTCAGCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGATGTC 
CTGTGCGGCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTATAACAATTCGAAGG 
GAGGTAGGTTTGTGCTGGCATTACTATCAGACCACCAAGATCTCAAATGGGCTAGATTCCCTAAGAGTGAT 
GGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTTGTTACAGACACACCAAAAGGGCCTAA 
AGTGA?yiTACTTGTACTTCATCAAAGGCTTAAACAACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTG 
CTACAGTACGTCTTCAGGCTGGAAATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCT 
TTTGCAGTAGACCCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGTGT 
GAAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAACATGGACCAAG 
AGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGACCATCCAAATCCTAAAGGATTC 
TGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACTTGTGCTAATGACCCAGTGGGTTTTACACTTAG 
AAACACAGTCTGTACCGTCTGCGGAATGTGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCT 
TGATGCAGTCTGCGGATGCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGT 
GCGGCACAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAGTTGCTGGTTTT 
GCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGATGAGGAAGGCAATTTATTAGACTCTTA 
CTTTGTAGTTAAGAGGCATACTATGTCTAACTACCAACATGAAGAGACTATTTATAACTTGGTTAAAGATT 
GTCCAGCGGTTGCTGTCCATGACTTTTTCAAGTTTAGAGTAGATGGTGACATGGTACCACATATATCACGT 
CAGCGTCTAACTAAATACACAATGGCTGATTTAGTCTATGCTCTACGTCATTTTGATGAGGGTAATTGTGA 
TACATTAAAAGAAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG 
ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCCAATCATTATTA 
AAGACTGTACAATTCTGCGATGCTATGCGTGATGCAGGCATTGTAGGCGTACTGACATTAGATAATCAGGA 
TCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTACAAGTAGCACCAGGCTGCGGAGTTCCTATTGTGG 
ATTCATATTACTCATTGCTGATGCCCATCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGAT 
GCTGATCTCGCAAAACCACTTATTAAGTGGGATTTGCTGAAATATGATTTTACGGAAGAGAGACTTTGTCT 
CTTCGACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACTGTTTGGATGATAGGT 
GTATCCTTCATTGTGCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTACAAGTTTTGGACCACTA 
GTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAACTGGATACCATTTTCGTGAGTTAGGAGT 
CGTACATAATCAGGATGTAAACTTACATAGCTCGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTG 
ATCCAGCTATGCATGCAGCTTCTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCA 
CTAACAAACAATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTGT 
GTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTCAGGATGGCAACG 
CTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGTGTGATATCAGACAACTCCTATTC 
GTAGTTGAAGTTGTTGATAAATACTTTGATTGTTACGATGGTGGCTGTATTAATGCCAACCAAGTAATCGT 
TAACAATCTGGATAAATCAGCTGGTTTCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAA 
TGAGTTATGAGGATCAAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATG 
AATCTTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTAGTACTAT 
GACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGCCGCCACTAGAGGAGCTACTGTGGTAATTG 
GAACAAGCAAGTTTTACGGTGGCTGGCATAATATGTTAAAAACTGTTTACAGTGATGTAGAAACTCCACAC 
CTTATGGGTTGGGATTATCCAAAATGTGACAGAGCCATGCCTAACATGCTTAGGATAATGGCCTCTCTTGT 
TCTTGCTCGCAAACATAACACTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCGC 
AAGTATTAAGTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACCAGGTGGAACATCATCCGGTGAT 
GCTACAACTGCTTATGCTAATAGTGTCTTTAACATTTGTCAAGCTGTTACAGCCAATGTAAATGCACTTCT 
TTCAACTGATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTACAACACAGGCTCTATGAGTGTCTCT 
ATAGAAATAGGGATGTTGATCATGAATTCGTGGATGAGTTTTACGCTTACCTGCGTAAACATTTCTCCATG 
ATGATTCTTTCTGATGATGCCGTTGTGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGCTAGCAT 
TAAGAACTTTAAGGCAGTTCTTTATTATCAAAATAATGTGTTCATGTCTGAGGCAAAATGTTGGACTGAGA 
CTGACCTTACTAAAGGACCTCACGAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAGATGATTAC 
GTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTGTCGATGATATTGTCAAAAC 
AGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTATTGATGCTTACCCACTTACAAAACATCCTA 
ATCAGGAGTATGCTGATGTCTTTCACTTGTATTTACAATACATTAGAAAGTTACATGATGAGCTTACTGGC 
CACATGTTGGACATGTATTCCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTA 
TGAGGCTATGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGACTT 
CACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATGACCATGTCATTTCA 
ACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATGCCCCAGGTTGTGATGTCACTGATGT 
GACACAACTGTATCTAGGAGGTATGAGCTATTATTGCAAGTCACATAAGCCTCCCATTAGTTTTCCATTAT 
GTGCTAATGGTCAGGTTTTTGGTTTATACAAAAACACATGTGTAGGCAGTGACAATGTCACTGACTTCAAT 
GCGATAGCAACATGTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAA 
GCTTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATTGCCACTGTAC 
GCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAACCTAGACCACCATTGAACAGA 
AACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTAAAGTACAGATTGGAGAGTACACCTTTGAAAA 
AGGTGACTATGGTGATGCTGTTGTGTACAGAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTG 
TGTTGACATCTCACACTGTAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATT 
ACTGGCTTGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAA2VAGGTCGG 
CATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTGCCATCGGACTTGCTC 
TCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATGCAGCTGTTGATGCCCTATGTGAAAAG 
GCATTAAAATATTTGCCCATAGATAAATGTAGTAGAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGA 
TAAATTCAAAGTGAATTCAACACTAGAACAGTATGTTTTCTGCACTGTAAATGCATTGCCAGAAACAACTG 
CTGACATTGTAGTCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTT 
CGTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGCTGACTAAAGG 
CACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAATAGGTCCAGACATGTTCCTTG 
GAACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTGTGAGTGCTTTAGTTTATGACAATAAGCTAAAA 
GCACACAAGGATAAGTCAGCTCAATGCTTCAAAATGTTCTACAAAGGTGTTATTACACATGATGTTTCATC 
TGCAATCAACAGACCTCAAATAGGCGTTGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTG 
TTTTTATCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGACTGTT 
GATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACAAACTACTGAAACAGCACACTCTTGTAA 
TGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCATTTTGTGCATAATGTCTGATAGAGATC 
TTTATGACAAACTGCAATTTACAAGTCTAGAAATACCACGTCGCAATGTGGCTACATTACAAGCAGAAAAT 
GTAACTGGACTTTTTAAGGACTGTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCT 
CAGCGTTGATATAAAGTTCAAGACTGAAGGATTATGTGTTGACATACCAGGCATACCAAAGGACATGACCT 
ACCGTAGACTCATCTCTATGATGGGTTTCAAAATGAATTACCAAGTCAATGGTTACCCTAATATGTTTATC 
ACCCGCGAAGAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATGTAGAGGGCTGTCATGCAACTAG 
AGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGATTTTCTACAGGTGTTAACTTAGTAGCTGTACCGA 
CTGGTTATGTTGACACTGAAAATAACACAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGACCAG 
TTTAAACATCTTATACCACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAAT 
GCTCAGTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTGAGCTTA 
CATCAATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTGACAAACGTGCAACTTGC 
TTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTGTGGGTTTTGACTATGTCTATAACCCATT 
TATGATTGATGTTCAGCAGTGGGGCTTTACGGGTAACCTTCAGAGTAACCATGACCAACATTGCCAGGTAC 
ATGGAAATGCACATGTGGCTAGTTGTGATGCTATCATGACTAGATGTTTAGCAGTCCATGAGTGCTTTGTT 
AAGCGCGTTGATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAAA 
AGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATGACATTGGAAATC 
CAAAGGCTATCAAGTGTGTGCCTCAGGCTGAAGTAGAATGGAAGTTCTACGATGCTCAGCCATGTAGTGAC 
AAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATGCTACACATCACGATAAATTCACTGATGGTGTTTG 
TTTGTTTTGGAATTGTAACGTTGATCGTTACCCAGCCAATGCAATTGTGTGTAGGTTTGACACAAGAGTCT 
TGTCAAACTTGAACTTACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCCA 
GCTTTCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTTCTTTTACTATTCTGATAGTCCTTGTGA 
GTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTGCTACGTGTATTACACGAT 
GCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGTACCGACAGTACTTGGATGCATATAATATG 
ATGATTTCTGCTGGATTTAGCCTATGGATTTACAAACAATTTGATACTTATAACCTGTGGAATACATTTAC 
CAGGTTACAGAGTTTAGAAAATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGACACGCCGGCG 
AAGCACCTGTTTCCATCATTAATAATGCTGTTTACACAAAGGTAGATGGTATTGATGTGGAGATCTTTGAA 
AATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTAAACCAGTGCCAGA 
GATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTGTAATCTGGGACTACAAAAGAGAAG 
CCCCAGCACATGTATCTACAATAGGTGTCTGCACAATGACTGACATTGCCAAGA2!^CCTACTGAGAGTGCT 
TGTTCTTCACTTACTGTCTTGTTTGATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAA 
TGGTGTTTTAATAACAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCA 
ATGGAGTCACATTAATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACGGCATTATT 
CAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTAAGCCCAGATCACAAATGGA 
AACTGACTTTCTCGAGCTCGCTATGGATGAATTCATACAGCGATATAAGCTCGAGGGCTATGCCTTCGAAC 
ACATCGTTTATGGAGATTTCAGTCATGGACAACTTGGCGGTCTTCATTTAATGATAGGCTTAGCCAAGCGC 
TCACAAGATTCACCACTTAAATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAAC 
AGATGCGCAAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCGAGA 
TAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACTATGCTGAAATTTCA 
TTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAAAACTACAAGCAAGTCGAGCGTGGCA 
ACCAGGTGTTGCGATGCCTAACTTGTACAAGATGCAAAGAATGCTTCTTGAAAAGTGTGACCTTCAGAATT 
ATGGTGAAAATGCTGTTATACCAAAAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATAC 
TTAAATACACTTACTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTGCTGGCTCTGATAAAGG 
AGTTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATTCAGATCTTA 
ATGACTTCGTCTCCGACGCATATTCTACTTTAATTGGAGACTGTGCAACAGTACATACGGCTAATAAATGG 
GACCTTATTATTAGCGATATGTATGACCCTAGGACCAAACATGTGACAAAAGAGAATGACTCTAAAGAAGG 
GTTTTTCACTTATCTGTGTGGATTTATAAAGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAA 
CAGAGCATTCTTGGAATGCTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGACAGCTTTTGTTACA 
AATGTAAATGCATCATCATCGGAAGCATTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAACAAAT 
TGATGGCTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCCAGTTGTCTTCCTATT 
CACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTGCTGTAATGTCTCTTAAGGAGAATCAA 
ATCAATGATATGATTTATTCTCTTCTGGAAAAAGGTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGT 
TTCAAGTGATATTCTTGTTAACAACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTG 
GTAGTGACCTTGACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCT 
ATGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCAGGATTTATTTCT 
TCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTGGCAACCCTGTCATACCTTTTA 
AGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATGTTGTCCGTGGTTGGGTTTTTGGTTCTACCATG 
AACAACAAGTCACAGTCGGTGATTATTATTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGA 
ATTGTGTGACAACCCTTTCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTCGATA 
ATGCATTTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAGGTAAT 
TTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGATGGGTTTCTCTATGTTTATAAGGGCTATCAACC 
TATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGAAACCTATTTTTAAGTTGCCTCTTGGTA 
TTAACATTACAAATTTTAGAGCCATTCTTACAGCCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCT 
GCAGCCTATTTTGTTGGCTATTTAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGGTACAATCAC 
AGATGCTGTTGATTGTTCTCAAAATCCACTTGCTGAACTCAAATGCTCTGTTAAGAGCTTTGAGATTGACA 
AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCCCTAATATTACA 
AACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTGTCTATGCATGGGAGAGAAAAAA 
AATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACAACTCAACATTTTTTTCAACCTTTAAGTGCTATG 
GCGTTTCTGCCACTAAGTTGAATGATCTTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGA 
GATGATGTAAGACAAATAGCGCCAGGACAAACTGGTGTTATTGCTGATTATAATTATAAATTGCCAGATGA 
TTTCATGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATAATTATA 
AATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAGACATATCTAATGTGCCTTTCTCCCCT 
GATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGCCATTAAATGATTATGGTTTTTACACCAC 
TACTGGCATTGGCTACCAACCTTACAGAGTTGTAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACGG 
'TTTGTGGACCAA^TTATCCACTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTxAATGGACTCACT 
GGTACTGGTGTGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTGA 
TTTCACTGATTCCGTTCGAGATCCTAAAACATCTGAAATATTAGACATTTCACCTTGCGCTTTTGGGGGTG 
TAAGTGTAATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTCTATATCAAGATGTTAACTGCACT 
GATGTTTCTACAGCAATTCATGCAGATCAACTCACACCAGCTTGGCGCATATATTCTACTGGAAACAATGT 
ATTCCAGACTCAAGCAGGCTGTCTTATAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTA 
TTGGAGCTGGCATTTGTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAGCCAAAAATCTATTGTG 
GCTTATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATACCTACTAA 
CTTTTCAATTAGCATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCTCCGTAGATTGTAATATGT 
ACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCCAATATGGTAGCTTTTGCACACAACTAAAT 
CGTGCACTCTCAGGTATTGCTGCTGAACAGGATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAAT 
GTACAAAACCCCAACTTTGAAATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAAGC 
CAACTAAGAGGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGAAG 
CAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGTTCAATGGACTTAC 
AGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTGCCTACACTGCTGCTCTAGTTAGTGGTACTGCCA 
CTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTCAAATACCTTTTGCTATGCAAATGGCATATAGGTTC 
AATGGCATTGGAGTTACCCAAAATGTTCTCTATGAGAACCAAAAACAAATCGCCAACCAATTTAACAAGGC 
GATTAGTCAAATTCAAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGTTGTTAACC 
AGAATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGTGCAATTTCAAGTGTGCTA 
AATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACAGGTTAATTACAGGCAGACT 
TCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGGCTGCTGAAATCAGGGCTTCTGCTAATCTTG 
CTGCTACTAAAATGTCTGAGTGTGTTCTTGGACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCAC 
CTTATGTCCTTCCCACAAGCAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGA 
GAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTTTTG 
TGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAACTTCTTTTCTCCACAAATAATTACTACAGACAAT 
ACATTTGTCTCAGGA?U^TTGTGATGTCGTTATTGGCATCATTAACAACACAGTTTATGATCCTCTGCAACC 
TGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGTACTTCAAAAATCATACATCACCAGATGTTGATCTTG 
GCGACATTTCAGGCATTAACGCTTCTGTCGTCAACATTCAA?^AAGAAATTGACCGCCTCAATGAGGTCGCT 
AAAAATTTAAATGAATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTG 
GTATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTTGTTGCATGA 
CTAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCAAGTTTGATGAGGATGACTCT 
GAGCCAGTTCTCAAGGGTGTCAAATTACATTACACATAAACGAACTTATGGATTTGTTTATGAGATTTTTT 
ACTCTTGGATCAATTACTGCACAGCCAGTAAAAATTGACAATGCTTCTCCTGCAAGTACTGTTCATGCTAC 
AGCAACGATACCGCTACAAGCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTCTTGCTGTTT 
TTCAGAGCGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCAGTTC 
ATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGCTGCAGGTATGGAGGC 
GCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCATCAACGCATGTAGAATTATTATGAGAT 
GTTGGCTTTGTTGGAAGTGCAAATCCAAGAACCCATTACTTTATGATGCCAACTACTTTGTTTGCTGGCAC 
ACACATAACTATGACTACTGTATACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGG 
CATTTCAACACCAAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACTCAGGTGTTA 
AAGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACAAATTACTACA 
GACACTGGTATTGAAAATGCTACATTCTTCATCTTTAACAAGCTTGTTAAAGACCCACCGAATGTGCAAAT
ACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGCAATGGATCCAATTTATGATGAGCCGACGACGA 
CTACTAGCGTGCCTTTGTAAGCACAAGAAAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACA 
GGTACGTTAATAGTTAATAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCAT 
CCTTACTGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTTTAGTAAAACCAACGGTTT 
ACGTCTACTCGCGTGTTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCTGGTCTAAACGAACTAA 
CTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATGGCAGACAACGGTACTATTACCGTTGAG 
GAGCTTAAACAACTCCTGGAACAATGGAACCTAGTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACT 
ACAATTTGCCTATTCTAATCGGAACAGGTTTTTGTACATAATAAAGCTTGTTTTCCTCTGGCTCTTGTGGC 
CAGTAACACTTGCTTGTTTTGTGCTTGCTGCTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT 
GCAATGGCTTGTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTGTTTGCTCGTAC
CCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCTTCTCAATGTGCCTCTCCGGGGGACAATTGTGA 
CCAGACCGCTCATGGAAAGTGAACTTGTCATTGGTGCTGTGATCATTCGTGGTCACTTGCGAATGGCCGGA 
CACTCCCTAGGGCGCTGTGACATTAAGGACCTGCCAAAAGAGATCACTGTGGCTACATCACGAACGCTTTC
TTATTACAAATTAGGAGCGTCGCAGCGTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTA 
TTGGAAACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTTGCTAGTACAGTAAGTG 
ACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGATTATCATTATGAGGACTT 
TCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAATAGTGAGACAATTATTTAAGCCTCTAACT 
AAGAAGAATTATTCGGAGTTAGATGATGAAGAACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAA 
TTATTCTCTTCCTGACATTGATTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGT 
ACGACTGTACTACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTCTTGC 
TGACAATAAATTTGCACTAACTTGCACTAGCACACACTTTGCTTTTGCTTGTGCTGACGGTACTCGACATA 
CCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGACAAGAGGAGGTTCAACAAGAGCTC 
TACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTATTTTTAATACTTTGCTTCACCATTAAGAGAAAGAC 
AGAATGAATGAGCTCACTTTAATTGACTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTTTTAAT 
AATGCTTATTATATTTTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGA 
ACATGAAACTTCTCATTGTTTTGACTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACAGCGCTGT 
GCATCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGGGTAATACTTATAGCACTG 
CTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGATGGCACACTATGGTTCAAACATGCACACCT 
AATGTTACTATCAACTGTCAAGATCCAGCTGGTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGG 
TCACCAAACTGCTGCATTTAGAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAAT 
GGACCCCAATCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAATAA 
CCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCAAGGTTTACCCAATAATACTGCGT 
CTTGGTTCACAGCTCTCACTCAGCATGGCAAGGAGGAACTTAGATTCCCTCGAGGCCAGGGCGTTCCAATC
AACACCAATAGTGGTCCAGATGACCAAATTGGCTACTACCGAAGAGCTACCCGACGAGTTCGTGGTGGTGA
CGGCAAAATGAAAGAGCTCAGCCCCAGATGGTACTTCTATTACCTAGGAACTGGCCCAGAAGCTTCACTTC 
CCTACGGCGCTAACAAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAAAGACCAC 
ATTGGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTACAACTTCCTCAAGGAACAACATTGCCAAA 
AGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCCTCATCACGTAGTCGCGGTA 
ATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCTCCTGCTCGAATGGCTAGCGGAGGTGGTGAA 
ACTGCCCTCGCGCTATTGCTGCTAGACAGATTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACA 
ACAACAAGGCCAAACTGTCACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCCAAAAACGTACTG 
CCACAAAACAGTACAACGTCACTCAAGCATTTGGGAGACGTGGTCCAGAACAAACCCAAGGAAATTTCGGG 
GACCAAGACCTAATCAGACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAATTTGCTCCAAGTGC 
CTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCTTCGGGAACATGGCTGACTTATCATG 
GAGCCATTAAATTGGATGACAAAGATCCACAATTCAAAGACAACGTCATACTGCTGAACAAGCACATTGAC 
GCATACAAAACATTCCCACCAACAGAGCCTAAAAAGGACAAAAAGAAAAAGACTGATGAAGCTCAGCCTTT 
GCCGCAGAGACAAAAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGAC 
AACTTCAAAATTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATGACCACACAA
GGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTCTACTCTTGTGCAGAATGAAT 
TCTCGTAACTAAACAGCACAAGTAGGTTTAGTTAACTTTAATCTCACATAGCAATCTTTAATCAATGTGTA 
ACATTAGGGAGGACTTGAAAGAGCCACCACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGT 
GAATAATGCTAGGGAGAGCTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCC 
CCATGTCATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAAA 

GenBank Accession No. AY274119.1; SEQ ID NO: 1



FIGURE 3H 



WO 2004/096842



PCT/CA2004/000626 



14/55 

CTACCCAGGAAAAGCCAACCAACCTCGATCTCTTGTAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGT 
AGCTGTCGCTCGGCTGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAATTTTACTGTCGTTGACA
AGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTTTCGTCCGTGTTGCAGTCGATCATCAGCA 
TACCTAGGTTTCGTCCGGGTGTGACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAAC 
ACACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCGTGGCTTCGGGGACTCTGTGG 
AAGAGGCCCTATCGGAGGCACGTGAACACCTCAAAAATGGCACTTGTGGTCTAGTAGAGCTGGAAAAAGGC 
GTACTGCCCCAGCTTGAACAGCCCTATGTGTTCATTAAACGTTCTGATGCCTTAAGCACCAATCACGGCCA 
CAAGGTCGTTGAGCTGGTTGCAGAAATGGACGGCATTCAGTACGGTCGTAGCGGTATAACACTGGGAGTAC 
TCGTGCCACATGTGGGCGAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAAGGGA 
GCCGGTGGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAGGTGACGAGCTTGGCACTGATCCCAT 
TGAAGATTATGAACAAAACTGGAACACTAAGCATGGCAGTGGTGCACTCCGTGAACTCACTCGTGAGCTCA 
ATGGAGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGCCCAGATGGGTACCCTCTTGATTGCATC 
AAAGATTTTCTCGCACGCGCGGGCAAGTCAATGTGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAA
GAGAGGTGTCTACTGCTGCCGTGACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTGATAAGAGCT 
ACGAGCACCAGACACCCTTCGAAATTAAGAGTGCCAAGAAATTTGACACTTTCAAAGGGGAATGCCCAAAG 
TTTGTGTTTCCTCTTAACTCAAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAGACTGAGGGTTT 
CATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGTAACAATATGCACTTGTCTACCT 
TGATGAAATGTAATCATTGCGATGAAGTTTCATGGCAGACGTGCGACTTTCTGAAAGCCACTTGTGAACAT 
TGTGGCACTGAAAATTTAGTTATTGAAGGACCTACTACATGTGGGTACCTACCTACTAATGCTGTAGTGAA 
AATGCCATGTCCTGCCTGTCAAGACCCAGAGATTGGACCTGAGCATAGTGTTGCAGATTATCACAACCACT 
CAAACATTGAAACTCGACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCCTATGTT 
GGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGCTGATATTGGCTCAGGCCATACTGGCAT 
TACTGGTGACAATGTGGAGACCTTGAATGAGGATCTCCTTGAGATACTGAGTCGTGAACGTGTTAACATTA 
ACATTGTTGGCGATTTTCATTTGAATGAAGAGGTTGCCATCATTTTGGCATCTTTCTCTGCTTCTACAAGT 
GCCTTTATTGACACTATAAAGAGTCTTGATTACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTA 
TAAAGTTACCAAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGTTTTAACACCAC 
TGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGATCAATTTTTGCGCGCACACTTGATGCAGCAAAC 
CACTCAATTCCTGATTTGCAAAGAGCAGCTGTCACCATACTTGATGGTATTTCTGAACAGTCATTACGTCT 
TGTCGACGCCATGGTTTATACTTCAGACCTGCTCACCAACAGTGTCATTATTATGGCATATGTAACTGGTG 
GTCTTGTACAACAGACTTCTCAGTGGTTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATC 
TTTGAATGGATTGAGGCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATGCTTGGGAGATTCTCAAATT 
TCTCATTACAGGTGTTTTTGACATCGTCAAGGGTCAAATACAGGTTGCTTCAGATAACATCAAGGATTGTG 
TAAAATGCTTCATTGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAAGTCACTATCGCTGGCGCA
AAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAAAGCAAGGGACTTTACCGTCAGTGTATACG 
TGGCAAGGAGCAGCTGCAACTACTCATGCCTCTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATT 
CACATGACACAGTACTTACCTCTGAGGAGGTTGTTCTCAAGAACGGTGAACTCGAAGCACTCGAGACGCCC 
GTTGATAGCTTCACAAATGGAGCTATCGTTGGCACACCAGTCTGTGTAAATGGCCTCATGCTCTTAGAGAT 
TAAGGACAAAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTCTTTCGCTTAAAAG 
GGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGATACTGTTTGGGAAGTTCAAGGTTACAAGAATGTG
AGAATCACATTTGAGCTTGATGAACGTGTTGACAAAGTGCTTAATGAAAAGTGCTCTGTCTACACTGTTGA 
ATCCGGTACCGAAGTTACTGAGTTTGCATGTGTTGTAGCAGAGGCTGTTGTGAAGACTTTACAACCAGTTT 
CTGATCTCCTTACCAACATGGGTATTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGAT 
GCTGGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGATGAGGAAGAAGAGGACGA
TGCAGAGTGTGAGGAAGAAGAAATTGATGAAACCTGTGAACATGAGTACGGTACAGAGGATGATTATCAAG 
GTCTCCCTCTGGAATTTGGTGCCTCAGCTGAAACAGTTCGAGTTGAGGAAGAAGAAGAGGAAGACTGGCTG 
GATGATACTACTGAGCAATCAGAGATTGAGCCAGAACCAGAACCTACACCTGAAGAACCAGTTAATCAGTT 
TACTGGTTATTTAAAACTTACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAGTG 
CTAATCCTATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATGGTGGTGGTGTAGCAGGTGCACTC 
AACAAGGCAACCAATGGTGCCATGCAAAAGGAGAGTGATGATTACATTAAGCTAAATGGCCCTCTTACAGT 
AGGAGGGTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGTCTGCATGTTGTTGGACCTAACCTAA 
ATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCATATGAAAATTTCAATTCACAGGACATCTTACTTGCA 
CCATTGTTGTCAGCAGGCATATTTGGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGACGGTTCG 
TACACAGGTTTATATTGCAGTCAATGACAAAGCTCTTTATGAGCAGGTTGTCATGGATTATCTTGATAACC 
TGAAGCCTAGAGTGGAAGCACCTAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACTGAGGAGAAA 
TCTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATTGATGAGGTTACCACAACACT 
GGAAGAAACTAAGTTTCTTACCAATAAGTTACTCTTGTTTGCTGATATCAATGGTAAGCTTTACCATGATT 
CTCAGAACATGCTTAGAGGTGAAGATATGTCTTTCCTTGAGAAGGATGCACCTTACATGGTAGGTGATGTT
ATCACTAGTGGTGATATCACTTGTGTTGTAATACCCTCCAAAAAGGCTGGTGGCACTACTGAGATGCTCTC
AAGAGCTTTGAAGAAAGTGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGTTATA 
CACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATTTTATGTACTACCTTCAGAAGCACCT 
AATGCTAAGGAAGAGATTCTAGGAACTGTATCCTGGAATTTGAGAGAAATGCTTGCTCATGCTGAAGAGAC 
AAGAAAATTAATGCCTATATGCATGGATGTTAGAGCCATAATGGCAACCATCCAACGTAAGTATAAAGGAA 
TTAAAATTCAAGAGGGCATCGTTGACTATGGTGTCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCT 
TCTATTATTACGAAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGTGACACATGG 
TTTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTAAAGCTCCTGCCGTAGTGTCAGTATCATCAC 
CAGATGCTGTTACTACATATAATGGATACCTCACTTCGTCATCAAAGACATCTGAGGAGCACTTTGTAGAA 
ACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTATTCAGGACAGCGTACAGAGTTAGGTGTTGAATT 
TCTTAAGCGTGGTGACAAAATTGTGTACCACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGG 
TTCTTTCACTTGACAAACTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAAAAGTGTTCACAACT 
GTGGACAACACTAATCTCCACACACAGCTTGTGGATATGTCTATGACATATGGACAGCAGTTTGGTCCAAC 
ATACTTGGATGGTGCTGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGTAAGACTTTCTTTGTAC 
TACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTACCATACTCTTGATGAGAGTTTTCTTGGT 
AGGTACATGTCTGCTTTAAACCACACAAAGAAATGGAAATTTCCTCAAGTTGGTGGTTTAACTTCAATTAA 
ATGGGCTGATAACAATTGTTATTTGTCTAGTGTTTTATTAGCACTTCAACAGCTTGAAGTCAAATTCAATG 
CACCAGCACTTCAAGAGGCTTATTATAGAGCCCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTC 
GCTTACAGTAATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTTCTACAGCATGC 
TAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTGTAAACATTGTGGTCAGAAAACTACTACCTTAA 
CGGGTGTAGAAGCTGTGATGTATATGGGTACTCTATCTTATGATAATCTTAAGACAGGTGTTTCCATTCCA 
TGTGTGTGTGGTCGTGATGCTACACAATATCTAGTACAACAAGAGTCTTCTTTTGTTATGATGTCTGCACC 
ACCTGCTGAGTATAAATTACAGCAAGGTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTG 
GTCATTACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCACCTTACAAAGATGTCA 
GAGTACAAAGGACCAGTGACTGATGTTTTCTACAAGGAAACATCTTACACTACAACCATCAAGCCTGTGTC 
GTATAAACTCGATGGAGTTACTTACACAGAGATTGAACCAAAATTGGATGGGTATTATAAAAAGGATAATG 
CTTACTATACAGAGCAGCCTATAGACCTTGTACCAACTCAACCATTACCAAATGCGAGTTTTGATAATTTC 
AAACTCACATGTTCTAACACAAAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCAGCTTC 
ACGAGAGCTATCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGGCTATTGACTATAGACACTATT 
CAGCGAGTTTCAAGAAAGGTGCTAAATTACTGCATAAGCCAATTGTTTGGCACATTAACCAGGCTACAACC 
AAGACAACGTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGTACAAAGCCAGTAGATACTTCAAA 
TTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGAATGGACAATCTTGCTTGTGAAAGTCAACAACCCA 
CCTCTGAAGAAGTAGTGGAAAATCCTACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTACCGAA 
GTTGTAGGCAATGTCATACTTAAACCATCAGATGAAGGTGTTAAAGTAACACAAGAGTTAGGTCATGAGGA 
TCTTATGGCTGCTTATGTGGAAAACACAAGCATTACCATTAAGAAACCTAATGAGCTTTCACTAGCCTTAG 
GTTTAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGGAGTAAAATTTTGGCTTAT 
GTCAAACCATTCTTAGGACAAGCAGCAATTACAACATCAAATTGCGCTAAGAGATTAGCACAACGTGTGTT 
TAACAATTATATGCCTTATGTGTTTACATTATTGTTCCAATTGTGTACTTTTACTAAAAGTACCAATTCTA 
GAATTAGAGCTTCACTACCTACAACTATTGCTAAAAATAGTGTTAAGAGTGTTGCTAAATTATGTTTGGAT 
GCCGGCATTAATTATGTGAAGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTGTT 
AAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGTACTCTTATCTAATTTTGGTGCTC 
CTTCTTATTGTAATGGCGTTAGAGAATTGTATCTTAATTCGTCTAACGTTACTACTATGGATTTCTGTGAA 
GGTTCTTTTCCTTGCAGCATTTGTTTAAGTGGATTAGACTCCCTTGATTCTTATCCAGCTCTTGAAACCAT 
TCAGGTGACGATTTCATCGTACAAGCTAGACTTGACAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCAT 
ATATGTTGTTCACAAAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGGCTATTTT 
GCTAGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCATTAGTATTGTACAAATGGCACCCGTTTC 
TGCAATGGTTAGGATGTACATCTTCTTTGCTTCTTTCTACTACATATGGAAGAGCTATGTTCATATCATGG 
ATGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGCAATCGTGCCACACGCGTTGAGTGTACAACT 
ATTGTTAATGGCATGAAGAGATCTTTCTATGTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAA 
TTGGAATTGTCTCAATTGTGACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAGTTGCTCGTGATT 
TGTCACTCCAGTTTAAAAGACCAATCAACCCTACTGACCAGTCATCGTATATTGTTGATAGTGTTGCTGTG 
AAAAATGGCGCGCTTCACCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGACATCCGCTCTCCCA 
TTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCACTGCCTATTAATGTCATAGTTTTTG 
ATGGCAAGTCCAAATGCGACGAGTCTGCTTCTAAGTCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAA 
CCTATTCTGTTGCTTGACCAAGCTCTTGTATCAGACGTTGGAGATAGTACTGAAGTTTCCGTTAAGATGTT 
TGATGCTTATGTCGACACCTTTTCAGCAACTTTTAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTA 
CAGCTCACAGCGAGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCAGCTGCCCGA
CAAGGTGTTGTTGATACCGATGTTGACACAAAGGATGTTATTGAATGTCTCAAACTTTCACATCACTCTGA
CTTAGAAGTGACAGGTGACAGTTGTAACAATTTCATGCTCACCTATAATAAGGTTGAAAACATGACGCCCA 
GAGATCTTGGCGCATGTATTGACTGTAATGCAAGGCATATCAATGCCCAAGTAGCAAAAAGTCACAATGTT 
TCACTCATCTGGAATGTAAAAGACTACATGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTAGTGCTGC 
CAAGAAGAACAACATACCTTTTAGACTAACTTGTGCTACAACTAGACAGGTTGTCAATGTCATAACTACTA 
AAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTTGTTTTAAACTTATGCTTAAGGCCACATTATTGTGC 
GTTCTTGCTGCATTGGTTTGTTATATCGTTATGCCAGTACATACATTGTCAATCCATGATGGTTACACAAA 
TGAAATCATTGGTTACAAAGCCATTCAGGATGGTGTCACTCGTGACATCATTTCTACTGATGATTGTTTTG 
CAAATAAACATGCTGGTTTTGACGCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAAAGCTGC 
CCTGTAGTAGCTGCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCTTACCGGGTACTGTGCTGAG 
AGCAATCAATGGTGACTTCTTGCATTTTCTACCTCGTGTTTTTAGTGCTGTTGGCAACATTTGCTACACAC 
CTTCCAAACTCATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTTGCTGCTGAGTGTACAATTTTT 
AAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGACACTAATTTGCTAGAGGGTTCTATTTCTTATAG 
TGAGCTTCGTCCAGACACTCGTTATGTGCTTATGGATGGTTCCATCATACAGTTTCCTAACACTTACCTGG 
AGGGTTCTGTTAGAGTAGTAACAACTTTTGATGCTGAGTACTGTAGACATGGTACATGCGAAAGGTCAGAA 
GTAGGTATTTGCCTATCTACCAGTGGTAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCAGGAGT 
TTTCTGTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTGCAACCTGTGGGTGCTT 
TAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTATTATTGCCATATTGGTGACTTGTGCTGCCTACTACTTT 
ATGAAATTCAGACGTGTTTTTGGTGAGTACAACCATGTTGTTGCTGCTAATGCACTTTTGTTTTTGATGTC 
TTTCACTATACTCTGTCTGGTACCAGCTTACAGCTTTCTGCCGGGAGTCTACTCAGTCTTTTACTTGTACT 
TGACATTCTATTTCACCAATGATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT 
GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCACTGCCATTGGTTCTTTAACAA 
CTATCTTAGGAAAAGAGTCATGTTTAATGGAGTTACATTTAGTACCTTCGAGGAGGCTGCTTTGTGTACCT 
TTTTGCTCAACAAGGAAATGTACCTAAAATTGCGTAGCGAGACACTGTTGCCACTTACACAGTATAACAGG 
TATCTTGCTCTATATAACAAGTACAAGTATTTCAGTGGAGCCTTAGATACTACCAGCTATCGTGAAGCAGC 
TTGCTGCCACTTAGCAAAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACCACCAC 
AGACATCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAATGGCATTCCCGTCAGGCAAAGTTGAA 
GGGTGCATGGTACAAGTAACCTGTGGAACTACAACTCTTAATGGATTGTGGTTGGATGACACAGTATACTG 
TCCAAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCTAACTATGAAGATCTGCTCATTCGCAAAT 
CCAACCATAGCTTTCTTGTTCAGGCTGGCAATGTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGT 
CTGCTTAGGCTTAAAGTTGATACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTATCCAACCTGG 
TCAAACATTTTCAGTTCTAGCATGCTACAATGGTTCACCATCTGGTGTTTATCAGTGTGCCATGAGACCTA 
ATCATACCATTAAAGGTTCTTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATTGATTATGATTGC 
GTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACACGCTGGTACTGACTTAGAAGGTAA 
ATTCTATGGTCCATTTGTTGACAGACAAACTGCACAGGCTGCAGGTACAGACACAACCATAACATTAAATG 
TTTTGGCATGGCTGTATGCTGCTGTTATCAATGGTGATAGGTGGTTTCTTAATAGATTCACCACTACTTTG 
AATGACTTTAACCTTGTGGCAATGAAGTACAACTATGAACCTTTGACACAAGATCATGTTGACATATTGGG 
ACCTCTTTCTGCTCAAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTGCAGAATG 
GTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGATGAGTTTACACCATTTGATGTTGTTAGA 
CAATGCTCTGGTGTTACCTTCCAAGGTAAGTTCAAGAAAATTGTTAAGGGCACTCATCATTGGATGCTTTT 
AACTTTCTTGACATCACTATTGATTCTTGTTCAAAGTACACAGTGGTCACTGTTTTTCTTTGTTTACGAGA 
ATGCTTTCTTGCCATTTACTCTTGGTATTATGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAG 
CACGCATTCTTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATATGGTCTACATGCC 
TGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAATTGGCTGACACTAGCTTGTCTGGTTATAGGCTTA 
AGGATTGTGTTATGTATGCTTCAGCTTTAGTTTTGCTTATTCTCATGACAGCTCGCACTGTTTATGATGAT 
GCTGCTAGACGTGTTTGGACACTGATGAATGTCATTACACTTGTTTACAAAGTCTACTATGGTAATGCTTT 
AGATCAAGCTATTTCCATGTGGGCCTTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTA 
TCATGTTTTTAGCTAGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGTTATTTATTACTGGCAAC 
ACCTTACAGTGTATCATGCTTGTTTATTGTTTCTTAGGCTATTGTTGCTGCTGCTACTTTGGCCTTTTCTG 
TTTACTCAACCGTTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTCTCTACACAAGAATTTAGGT 
ATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATTGATGCTTTCAAGCTTAACATTAAGTTGTTG 
GGTATTGGAGGTAAACCATGTATCAAGGTTGCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACATC 
TGTGGTACTGCTCTCGGTTCTTCAACAACTTAGAGTAGAGTCATCTTCTAAATTGTGGGCACAATGTGTAC 
AACTCCACAATGATATTCTTCTTGCAAAAGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTGTCT 
GTTTTGCTATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTCGATAACCGTGCTAC 
TCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACCATCATATGCCGCTTATGCCACTGCCCAGGAGGCCT 
ATGAGCAGGCTGTAGCTAATGGTGATTCTGAAGTCGTTCTCAAAAAGTTAAAGAAATCTTTGAATGTGGCT
AAATCTGAGTTTGACCGTGATGCTGCCATGCAACGCAAGTTGGAAAAGATGGCAGATCAGGCTATGACCCA
AATGTACAAACAGGCAAGATCTGAGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTCA 
CTATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGCGCGTGATGGTTGTGTTCCA 
CTCAACATCATACCATTGACTACAGCAGCCAAACTCATGGTTGTTGTCCCTGATTATGGTACCTACAAGAA 
CACTTGTGATGGTAACACCTTTACATATGCATCTGCACTCTGGGAAATCCAGCAAGTTGTTGATGCGGATA 
GCAAGATTGTTCAACTTAGTGAAATTAACATGGACAATTCACCAAATTTGGCTTGGCCTCTTATTGTTACA 
GCTCTAAGAGCCAACTCAGCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGATGTC 
CTGTGCGGCTGGTACCACACAAACAGCTTGTACTGATGACAATGCACTTGCCTACTATAACAATTCGAAGG 
GAGGTAGGTTTGTGCTGGCATTACTATCAGACCACCAAGATCTCAAATGGGCTAGATTCCCTAAGAGTGAT 
GGTACAGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTTGTTACAGACACACCAAAAGGGCCTAA 
AGTGAAATACTTGTACTTCATCAAAGGCTTAAACAACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTG 
CTACAGTACGTCTTCAGGCTGGAAATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCTTCTGTGCT 
TTTGCAGTAGACCCTGCTAAAGCATATAAGGATTACCTAGCAAGTGGAGGACAACCAATCACCAACTGTGT 
GAAGATGTTGTGTACACACACTGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAACATGGACCAAG 
AGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGACCATCCAAATCCTAAAGGATTC 
TGTGACTTGAAAGGTAAGTACGTCCAAATACCTACCACTTGTGCTAATGACCCAGTGGGTTTTACACTTAG 
AAACACAGTCTGTACCGTCTGCGGAATGTGGAAAGGTTATGGCTGTAGTTGTGACCAACTCCGCGAACCCT 
TGATGCAGTCTGCGGATGCATCAACGTTTTTAAACGGGTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGT 
GCGGCACAGGCACTAGTACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAGTTGCTGGTTTT 
GCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGATGAGGAAGGCAATTTATTAGACTCTTA 
CTTTGTAGTTAAGAGGCATACTATGTCTAACTACCAACATGAAGAGACTATTTATAACTTGGTTAAAGATT 
GTCCAGCGGTTGCTGTCCATGACTTTTTCAAGTTTAGAGTAGATGGTGACATGGTACCACATATATCACGT 
CAGCGTCTAACTAAATACACAATGGCTGATTTAGTCTATGCTCTACGTCATTTTGATGAGGGTAATTGTGA 
TACATTAAAAGAAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAGGATTGGTATG 
ACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAACTTAGGTGAGCGTGTACGCCAATCATTATTA 
AAGACTGTACAATTCTGCGATGCTATGCGTGATGCAGGCATTGTAGGCGTACTGACATTAGATAATCAGGA 
TCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTACAAGTAGCACCAGGCTGCGGAGTTCCTATTGTGG 
ATTCATATTACTCATTGCTGATGCCCATCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGAT 
GCTGATCTCGCAAAACCACTTATTAAGTGGGATTTGCTGAAATATGATTTTACGGAAGAGAGACTTTGTCT 
CTTCGACCGTTATTTTAAATATTGGGACCAGACATACCATCCCAATTGTATTAACTGTTTGGATGATAGGT 
GTATCCTTCATTGTGCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTACAAGTTTTGGACCACTA 
GTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAACTGGATACCATTTTCGTGAGTTAGGAGT 
CGTACATAATCAGGATGTAAACTTACATAGCTCGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTG 
ATCCAGCTATGCATGCAGCTTCTGGCAATTTATTGCTAGATAAACGCACTACATGCTTTTCAGTAGCTGCA 
CTAACAAACAATGTTGCTTTTCAAACTGTCAAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTGT 
GTCTAAAGGTTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTCAGGATGGCAACG 
CTGCTATCAGTGATTATGACTATTATCGTTATAATCTGCCAACAATGTGTGATATCAGACAACTCCTATTC 
GTAGTTGAAGTTGTTGATAAATACTTTGATTGTTACGATGGTGGCTGTATTAATGCCAACCAAGTAATCGT 
TAACAATCTGGATAAATCAGCTGGTTTCCCATTTAATAAATGGGGTAAGGCTAGACTTTATTATGACTCAA 
TGAGTTATGAGGATCAAGATGCACTTTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATG 
AATCTTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTCTCTATCTGTAGTACTAT 
GACAAATAGACAGTTTCATCAGAAATTATTGAAGTCAATAGCCGCCACTAGAGGAGCTACTGTGGTAATTG 
GAACAAGCAAGTTTTACGGTGGCTGGCATAATATGTTAAAAACTGTTTACAGTGATGTAGAAACTCCACAC 
CTTATGGGTTGGGATTATCCAAAATGTGACAGAGCCATGCCTAACATGCTTAGGATAATGGCCTCTCTTGT 
TCTTGCTCGCAAACATAACACTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCGC 
AAGTATTAAGTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACCAGGTGGAACATCATCCGGTGAT 
GCTACAACTGCTTATGCTAATAGTGTCTTTAACATTTGTCAAGCTGTTACAGCCAATGTAAATGCACTTCT 
TTCAACTGATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTACAACACAGGCTCTATGAGTGTCTCT 
ATAGAAATAGGGATGTTGATCATGAATTCGTGGATGAGTTTTACGCTTACCTGCGTAAACATTTCTCCATG 
ATGATTCTTTCTGATGATGCCGTTGTGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGCTAGCAT 
TAAGAACTTTAAGGCAGTTCTTTATTATCAAAATAATGTGTTCATGTCTGAGGCAAAATGTTGGACTGAGA 
CTGACCTTACTAAAGGACCTCACGAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAGATGATTAC 
GTGTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTGTCGATGATATTGTCAAAAC 
AGATGGTACACTTATGATTGAAAGGTTCGTGTCACTGGCTATTGATGCTTACCCACTTACAAAACATCCTA 
ATCAGGAGTATGCTGATGTCTTTCACTTGTATTTACAATACATTAGAAAGTTACATGATGAGCTTACTGGC 
CACATGTTGGACATGTATTCCGTAATGCTAACTAATGATAACACCTCACGGTACTGGGAACCTGAGTTTTA 
TGAGGCTATGTACACACCACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGACTT
CACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAGTGCTGCTATGACCATGTCATTTCA
ACATCACACAAATTAGTGTTGTCTGTTAATCCCTATGTTTGCAATGCCCCAGGTTGTGATGTCACTGATGT 
GACACAACTGTATCTAGGAGGTATGAGCTATTATTGCAAGTCACATAAGCCTCCCATTAGTTTTCCATTAT 
GTGCTAATGGTCAGGTTTTTGGTTTATACAAAAACACATGTGTAGGCAGTGACAATGTCACTGACTTCAAT 
GCGATAGCAACATGTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAGAGACTCAA 
GCTTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATTTAAGCTGTCATATGGTATTGCCACTGTAC 
GCGAAGTACTCTCTGACAGAGAATTGCATCTTTCATGGGAGGTTGGAAAACCTAGACCACCATTGAACAGA 
AACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTAAAGTACAGATTGGAGAGTACACCTTTGAAAA 
AGGTGACTATGGTGATGCTGTTGTGTACAGAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTG 
TGTTGACATCTCACACTGTAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCACTATGTGAGAATT 
ACTGGCTTGTACCCAACACTCAACATCTCAGATGAGTTTTCTAGCAATGTTGCAAATTATCAAAAGGTCGG 
CATGCAAAAGTACTCTACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTGCCATCGGACTTGCTC 
TCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATGCAGCTGTTGATGCCCTATGTGAAAAG 
GCATTAAAATATTTGCCCATAGATAAATGTAGTAGAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGA 
TAAATTCAAAGTGAATTCAACACTAGAACAGTATGTTTTCTGCACTGTAAATGCATTGCCAGAAACAACTG 
CTGACATTGTAGTCTTTGATGAAATCTCTATGGCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTT 
CGTGCAAAACACTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGCTGACTAAAGG 
CACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTTATGAAAACAATAGGTCCAGACATGTTCCTTG 
GAACTTGTCGCCGTTGTCCTGCTGAAATTGTTGACACTGTGAGTGCTTTAGTTTATGACAATAAGCTAAAA 
GCACACAAGGATAAGTCAGCTCAATGCTTCAAAATGTTCTACAAAGGTGTTATTACACATGATGTTTCATC 
TGCAATCAACAGACCTCAAATAGGCGTTGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTG 
TTTTTATCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTGCCTACGCAGACTGTT 
GATTCATCACAGGGTTCTGAATATGACTATGTCATATTCACACAAACTACTGAAACAGCACACTCTTGTAA 
TGTCAACCGCTTCAATGTGGCTATCACAAGGGCAAAAATTGGCATTTTGTGCATAATGTCTGATAGAGATC 
TTTATGACAAACTGCAATTTACAAGTCTAGAAATACCACGTCGCAATGTGGCTACATTACAAGCAGAAAAT 
GTAACTGGACTTTTTAAGGACTGTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCT 
CAGCGTTGATATAAAGTTCAAGACTGAAGGATTATGTGTTGACATACCAGGCATACCAAAGGACATGACCT 
ACCGTAGACTCATCTCTATGATGGGTTTCAAAATGAATTACCAAGTCAATGGTTACCCTAATATGTTTATC 
ACCCGCGAAGAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATGTAGAGGGCTGTCATGCAACTAG 
AGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGATTTTCTACAGGTGTTAACTTAGTAGCTGTACCGA 
CTGGTTATGTTGACACTGAAAATAACACAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGACCAG 
TTTAAACATCTTATACCACTCATGTATAAAGGCTTGCCCTGGAATGTAGTGCGTATTAAGATAGTACAAAT 
GCTCAGTGATACACTGAAAGGATTGTCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTGAGCTTA 
CATCAATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTGACAAACGTGCAACTTGC 
TTTTCTACTTCATCAGATACTTATGCCTGCTGGAATCATTCTGTGGGTTTTGACTATGTCTATAACCCATT 
TATGATTGATGTTCAGCAGTGGGGCTTTACGGGTAACCTTCAGAGTAACCATGACCAACATTGCCAGGTAC 
ATGGAAATGCACATGTGGCTAGTTGTGATGCTATCATGACTAGATGTTTAGCAGTCCATGAGTGCTTTGTT 
AAGCGCGTTGATTGGTCTGTTGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAAA 
AGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCAGTTCTTCATGACATTGGAAATC 
CAAAGGCTATCAAGTGTGTGCCTGAGGCTGAAGTAGAATGGAAGTTCTACGATGCTCAGCCATGTAGTGAC 
AAAGCTTACAAAATAGAGGAACTCTTCTATTCTTATGCTACACATCACGATAAATTCACTGATGGTGTTTG 
TTTGTTTTGGAATTGTAACGTTGATCGTTACCCAGCCAATGCAATTGTGTGTAGGTTTGACACAAGAGTCT 
TGTCAAACTTGAACTTACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCACACTCCA 
GCTTTCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTTCTTTTACTATTCTGATAGTCCTTGTGA 
GTCTCATGGCAAACAAGTAGTGTCGGATATTGATTATGTTCCACTCAAATCTGCTACGTGTATTACACGAT 
GCAATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGTACCGACAGTACTTGGATGCATATAATATG 
ATGATTTCTGCTGGATTTAGCCTATGGATTTACAAACAATTTGATACTTATAACCTGTGGAATACATTTAC 
CAGGTTACAGAGTTTAGAAAATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGGACACGCCGGCG 
AAGCACCTGTTTCCATCATTAATAATGCTGTTTACACAAAGGTAGATGGTATTGATGTGGAGATCTTTGAA 
AATAAGACAACACTTCCTGTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTAAACCAGTGCCAGA 
GATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTGTAATCTGGGACTACAAAAGAGAAG 
CCCCAGCACATGTATCTACAATAGGTGTCTGCACAATGACTGACATTGCCAAGAAACCTACTGAGAGTGCT 
TGTTCTTCACTTACTGTCTTGTTTGATGGTAGAGTGGAAGGACAGGTAGACCTTTTTAGAAACGCCCGTAA 
TGGTGTTTTAATAACAGAAGGTTCAGTCAAAGGTCTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCA 
ATGGAGTCACATTAATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACGGCATTATT 
CAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTAGAGGATTTTAAGCCCAGATCACAAATGGA 
AACTGACTTTCTCGAGCTCGCTATGGATGAATTCATACAGCGATATAAGCTCGAGGGCTATGCCTTCGAAC
ACATCGTTTATGGAGATTTCAGTCATGGACAACTTGGCGGTCTTCATTTAATGATAGGCTTAGCCAAGCGC
TCACAAGATTCACCACTTAAATTAGAGGATTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAAC 
AGATGCGCAAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGATGACTTTGTCGAGA 
TAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGTGGTCAAGGTTACAATTGACTATGCTGAAATTTCA 
TTCATGCTTTGGTGTAAGGATGGACATGTTGAAACCTTCTACCCAAAACTACAAGCAAGTCAAGCGTGGCA 
ACCAGGTGTTGCGATGCCTAACTTGTACAAGATGCAAAGAATGCTTCTTGAAAAGTGTGACCTTCAGAATT 
ATGGTGAAAATGCTGTTATACCAAAAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATAC 
TTAAATACACTTACTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGGTGCTGGCTCTGATAAAGG 
AGTTGCACCAGGTACAGCTGTGCTCAGACAATGGTTGCCAACTGGCACACTACTTGTCGATTCAGATCTTA 
ATGACTTCGTCTCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAGTACATACGGCTAATAAATGG 
GACCTTATTATTAGCGATATGTATGACCCTAGGACCAAACATGTGACAAAAGAGAATGACTCTAAAGAAGG
GTTTTTCACTTATCTGTGTGGATTTATAAAGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGATAA 
CAGAGCATTCTTGGAATGCTGACCTTTACAAGCTTATGGGCCATTTCTCATGGTGGACAGCTTTTGTTACA 
AATGTAAATGCATCATCATCGGAAGCATTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAACAAAT 
TGATGGCTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCCAGTTGTCTTCCTATT 
CACTCTTTGACATGAGCAAATTTCCTCTTAAATTAAGAGGAACTGCTGTAATGTCTCTTAAGGAGAATCAA 
ATCAATGATATGATTTATTCTCTTCTGGAAAAAGGTAGGCTTATCATTAGAGAAAACAACAGAGTTGTGGT 
TTCAAGTGATATTCTTGTTAACAACTAAACGAACATGTTTATTTTCTTATTATTTCTTACTCTCACTAGTG 
GTAGTGACCTTGACCGGTGCACCACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCT 
ATGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTATTTAACTCAGGATTTATTTCT 
TCCATTTTATTCTAATGTTACAGGGTTTCATACTATTAATCATACGTTTGGCAACCCTGTCATACCTTTTA 
AGGATGGTATTTATTTTGCTGCCACAGAGAAATCAAATGTTGTCCGTGGTTGGGTTTTTGGTTCTACCATG 
AACAACAAGTCACAGTCGGTGATTATTATTAACAATTCTACTAATGTTGTTATACGAGCATGTAACTTTGA 
ATTGTGTGACAACCCTTTCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTCGATA 
ATGCATTTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCTTGATGTTTCAGAAAAGTCAGGTAAT 
TTTAAACACTTACGAGAGTTTGTGTTTAAAAATAAAGATGGGTTTCTCTATGTTTATAAGGGCTATCAACC 
TATAGATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGAAACCTATTTTTAAGTTGCCTCTTGGTA 
TTAACATTACAAATTTTAGAGCCATTCTTACAGCCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCT 
GCAGCCTATTTTGTTGGCTATTTAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGGTACAATCAC 
AGATGCTGTTGATTGTTCTCAAAATCCACTTGCTGAACTCAAATGCTCTGTTAAGAGCTTTGAGATTGACA 
AAGGAATTTACCAGACCTCTAATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCCCTAATATTACA 
AACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTGTCTATGCATGGGAGAGAAAAAA 
AATTTCTAATTGTGTTGCTGATTACTCTGTGCTCTACAACTCAACATTTTTTTCAACCTTTAAGTGCTATG 
GCGTTTCTGCCACTAAGTTGAATGATCTTTGCTTCTCCAATGTCTATGCAGATTCTTTTGTAGTCAAGGGA 
GATGATGTAAGACAAATAGCGCCAGGACAAACTGGTGTTATTGCTGATTATAATTATAAATTGCCAGATGA 
TTTCATGGGTTGTGTCCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATAATTATA 
AATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGAGACATATCTAATGTGCCTTTCTCCCCT 
GATGGCAAACCTTGCACCCCACCTGCTCTTAATTGTTATTGGCCATTAAATGATTATGGTTTTTACACCAC 
TACTGGCATTGGCTACCAACCTTACAGAGTTGTAGTACTTTCTTTTGAACTTTTAAATGCACCGGCCACGG 
TTTGTGGACCAAAATTATCCACTGACCTTATTAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACT 
GGTACTGGTGTGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGTGATGTTTCTGA 
TTTCACTGATTCCGTTCGAGATCCTAAAACATCTGAAATATTAGACATTTCACCTTGCGCTTTTGGGGGTG 
TAAGTGTAATTACACCTGGAACAAATGCTTCATCTGAAGTTGCTGTTCTATATCAAGATGTTAACTGCACT 
GATGTTTCTACAGCAATTCATGCAGATCAACTCACACCAGCTTGGCGCATATATTCTACTGGAAACAATGT 
ATTCCAGACTCAAGCAGGCTGTCTTATAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTA 
TTGGAGCTGGCATTTGTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAGCCAAAAATCTATTGTG 
GCTTATACTATGTCTTTAGGTGCTGATAGTTCAATTGCTTACTCTAATAACACCATTGCTATACCTACTAA 
CTTTTCAATTAGCATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCTCCGTAGATTGTAATATGT 
ACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCCAATATGGTAGCTTTTGCACACAACTAAAT 
CGTGCACTCTCAGGTATTGCTGCTGAACAGGATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAAT 
GTACAAAACCCCAACTTTGAAATATTTTGGTGGTTTTAATTTTTCACAAATATTACCTGACCCTCTAAAGC
CAACTAAGAGGTCTTTTATTGAGGACTTGCTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGAAG 
CAATATGGCGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGTTCAATGGACTTAC 
AGTGTTGCCACCTCTGCTCACTGATGATATGATTGCTGCCTACACTGCTGCTCTAGTTAGTGGTACTGCCA 
CTGCTGGATGGACATTTGGTGCTGGCGCTGCTCTTCAAATACCTTTTGCTATGCAAATGGCATATAGGTTC
AATGGCATTGGAGTTACCCAAAATGTTCTCTATGAGAACCAAAAACAAATCGCCAACCAATTTAACAAGGC 
GATTAGTCAAATTCAAGAATCACTTACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGTTGTTAACC
AGAATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGTGCAATTTCAAGTGTGCTA
AATGATATCCTTTCGCGACTTGATAAAGTCGAGGCGGAGGTACAAATTGACAGGTTAATTACAGGCAGACT 
TCAAAGCCTTCAAACCTATGTAACACAACAACTAATCAGGGCTGCTGAAATCAGGGCTTCTGCTAATCTTG 
CTGCTACTAAAATGTCTGAGTGTGTTCTTGGACAATCAAAAAGAGTTGACTTTTGTGGAAAGGGCTACCAC 
CTTATGTCCTTCCCACAAGCAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAGGA 
GAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATACTTCCCTCGTGAAGGTGTTTTTG 
TGTTTAATGGCACTTCTTGGTTTATTACACAGAGGAACTTCTTTTCTCCACAAATAATTACTACAGACAAT 
ACATTTGTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACAACACAGTTTATGATCCTCTGCAACC 
TGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGTACTTCAAAAATCATACATCACCAGATGTTGATCTTG 
GCGACATTTCAGGCATTAACGCTTCTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGAGGTCGCT 
AAAAATTTAAATGAATCACTCATTGACCTTCAAGAATTGGGAAAATATGAGCAATATATTAAATGGCCTTG 
GTATGTTTGGCTCGGCTTCATTGCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTTGTTGCATGA
CTAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCAAGTTTGATGAGGATGACTCT 
GAGCCAGTTCTCAAGGGTGTCAAATTACATTACACATAAACGAACTTATGGATTTGTTTATGAGATTTTTT 
ACTCTTAGATCAATTACTGCACAGCCAGTAAAAATTGACAATGCTTCTCCTGCAAGTACTGTTCATGCTAC 
AGCAACGATACCGCTACAAGCCTCACTCCCTTTCGGATGGCTTGTTATTGGCGTTGCATTTCTTGCTGTTT 
TTCAGAGCGCTACCAAAATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCAGTTC 
ATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTTTGCTTGTCGCTGCAGGTATGGAGGC 
GCAATTTTTGTACCTCTATGCCTTGATATATTTTCTACAATGCATCAACGCATGTAGAATTATTATGAGAT 
GTTGGCTTTGTTGGAAGTGCAAATCCAAGAACCCATTACTTTATGATGCCAACTACTTTGTTTGCTGGCAC 
ACACATAACTATGACTACTGTATACCATATAACAGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGG 
CATTTCAACACCAAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACTCAGGTGTTA 
AAGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTACTACCAGCTTGAGTCTACACAAATTACTACA 
GACACTGGTATTGAAAATGCTACATTCTTCATCTTTAACAAGCTTGTTAAAGACCCACCGAATGTGCAAAT 
ACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGCAATGGATCCAATTTATGATGAGCCGACGACGA 
CTACTAGCGTGCCTTTGTAAGCACAAGAAAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACA 
GGTACGTTAATAGTTAATAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTAGTCACACTAGCCAT 
CCTTACTGCGCTTCGATTGTGTGCGTACTGCTGCAATATTGTTAACGTGAGTTTAGTAAAACCAACGGTTT 
ACGTCTACTCGCGTGTTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCTGGTCTAAACGAACTAA 
CTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATGGCAGACAACGGTACTATTACCGTTGAG 
GAGCTTAAACAACTCCTGGAACAATGGAACCTAGTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACT 
ACAATTTGCCTATTCTAATCGGAACAGGTTTTTGTACATAATAAAGCTTGTTTTCCTCTGGCTCTTGTGGC 
CAGTAACACTTGCTTGTTTTGTGCTTGCTGCTGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATT 
GCAATGGCTTGTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTGTTTGCTCGTAC 
CCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCTTCTCAATGTGCCTCTCCGGGGGACAATTGTGA 
CCAGACCGCTCATGGAAAGTGAACTTGTCATTGGTGCTGTGATCATTCGTGGTCACTTGCGAATGGCCGGA 
CACTCCCTAGGGCGCTGTGACATTAAGGACCTGCCAAAAGAGATCACTGTGGCTACATCACGAACGCTTTC 
TTATTACAAATTAGGAGCGTCGCAGCGTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTA 
TTGGAAACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTTGCTAGTACAGTAAGTG 
ACAACAGATGTTTCATCTTGTTGACTTCCAGGTTACAATAGCAGAGATATTGATTATCATTATGAGGACTT 
TCAGGATTGCTATTTGGAATCTTGACGTTATAATAAGTTCAATAGTGAGACAATTATTTAAGCCTCTAACT 
AAGAAGAATTATTCGGAGTTAGATGATGAAGAACCTATGGAGTTAGATTATCCATAAAACGAACATGAAAA 
TTATTCTCTTCCTGACATTGATTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGT 
ACGACTGTACTACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAATTCACCATTTCACCCTCTTGC 
TGACAATAAATTTGCACTAACTTGCACTAGCACACACTTTGCTTTTGCTTGTGCTGACGGTACTCGACATA 
CCTATCAGCTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGACAAGAGGAGGTTCAACAAGAGCTC 
TACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTATTTTTAATACTTTGCTTCACCATTAAGAGAAAGAC 
AGAATGAATGAGCTCACTTTAATTGACTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTTTTAAT 
AATGCTTATTATATTTTGGTTTTCACTCGAAATCCAGGATCTAGAAGAACCTTGTACCAAAGTCTAAACGA 
ACATGAAACTTCTCATTGTTTTGACTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACAGCGCTGT 
GCATCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGGGTAATACTTATAGCACTG 
CTTGGCTTTGTGCTCTAGGAAAGGTTTTACCTTTTCATAGATGGCACACTATGGTTCAAACATGCACACCT 
AATGTTACTATCAACTGTCAAGATCCAGCTGGTGGTGCGCTTATAGCTAGGTGTTGGTACCTTCATGAAGG 
TCACCAAACTGCTGCATTTAGAGACGTACTTGTTGTTTTAAATAAACGAACAAATTAAAATGTCTGATAAT 
GGACCCCAATCAAACCAACGTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAATAA 
CCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCAAGGTTTACCCAATAATACTGCGT 
CTTGGTTCACAGCTCTCACTCAGCATGGCAAGGAGGAACTTAGATTCCCTCGAGGCCAGGGCGTTCCAATC
AACACCAATAGTGGTCCAGATGACCAAATTGGCTACTACCGAAGAGCTACCCGACGAGTTCGTGGTGGTGA
CGGCAAAATGAAAGAGCTCAGCCCCAGATGGTACTTCTATTACCTAGGAACTGGCCCAGAAGCTTCACTTC 
CCTACGGCGCTAACAAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAAAGACCAC 
ATTGGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTACAACTTCCTCAAGGAACAACATTGCCAAA 
AGGCTTCTACGCAGAGGGAAGCAGAGGCGGCAGTCAAGCCTCTTCTCGCTCCTCATCACGTAGTCGCGGTA 
ATTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCTCCTGCTCGAATGGCTAGCGGAGGTGGTGAA 
ACTGCCCTCGCGCTATTGCTGCTAGACAGATTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACA 
ACAACAAGGCCAAACTGTCACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCCAAAAACGTACTG 
CCACAAAACAGTACAACGTCACTCAAGCATTTGGGAGACGTGGTCCAGAACAAACCCAAGGAAATTTCGGG
GACCAAGACCTAATCAGACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAATTTGCTCCAAGTGC 
CTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCTTCGGGAACATGGCTGACTTATCATG 
GAGCCATTAAATTGGATGACAAAGATCCACAATTCAAAGACAACGTCATACTGCTGAACAAGCACATTGAC 
GCATACAAAACATTCCCACCAACAGAGCCTAAAAAGGACAAAAAGAAAAAGACTGATGAAGCTCAGCCTTT 
GCCGCAGAGACAAAAGAAGCAGCCCACTGTGACTCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGAC 
AACTTCAAAATTCCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATGACCACACAA 
GGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACGATACATAGTCTACTCTTGTGCAGAATGAAT 
TCTCGTAACTAAACAGCACAAGTAGGTTTAGTTAACTTTAATCTCACATAGCAATCTTTAATCAATGTGTA 
ACATTAGGGAGGACTTGAAAGAGCCACCACATTTTCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGT 
GAATAATGCTAGGGAGAGCTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCC 
CCATGTGATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAAA 

GenBank Accession No. AY274119.2; SEQ ID NO: 2
ERV-2

TOR2 ACACTCATGATGACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTA
AIBV 



ERV-2

TOR2 CGATACATAGTCTACTCTTGTGCAGAATGAATTCTCGTAACTAAACAGCACAAGTAGGTT
AIBV 



ERV-2 ACCCGTTACCCTAAAATTCCCTCC 

TOR2 TAGTTAACTTTAATCTCACATAGCAATCTTTAATCAATGTGTAACATTAGGGAGGACTTG

AIBV TAGTTTAGTTTAAGTTAGTTTAG 

* * ** * 

ERV-2 CCTTTCTCTTCAC TCGCCGAGGCCACGCCGAGTAGGACCGAGGGTACAGC

TOR2 AAAGAGCCACCACATTT — TCATCGAGGCCACGCGGAGTACGATCGAGGGTACAGT

AIBV AGTAGGTATAAAGATGCCAGTGCCGGGGCCACGCGGAGTACGATCGAGGGTACAGCACTA 

-* :jlr-* ******** ***** ** *********** 

ERV-2 -GAGTCTTT-TAGTTTAAGGTGT-TAGATGTAAGGTACGTGGGCTTTCT — TTTGGTTTA

TOR2 -GAATAATGCTAGGGAGAGCTGCCTATATGGAAGAGCCCTAATGTGTAAAATTAATTTTA

AIBV GGACGCCCATTAGGGGAAGA-GCTAAATTTTAGTTTAAGTTAAGTTTAA TTGGCTAA 

** *** ****** * ** * ** 

ERV-2 CTTCTTC — GenBank: AF361253 (SEQ ID NO: 31)

TOR2 GTAGTGCTATCCCCATGTGATTTTAATAGCTTCTTAGGAGAATGAC (SEQ ID NO: 18)

AIBV GTATAGTTAAAATTTATAGGCTAGTATAGAGTTAGAGCA GenBank: NC_001451 (SEQ ID NO: 32)

MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQHTSSMRGVYYPDEIFRSD

TLYLTQDLFLPFYSNVTGFHTINHTFGNPVIPFKDGIYFAATEKSNVVRG

WVT'GSTMNISIXSQS VI I INNSTNWIRACNFELCDNPFFAVSKPMGT 

MIFDNAFNCTFEYISDAFSLDVSEKSGNFKHLREFVFKNKDGFLYVYKGY 

QPIDVVRDLPSGFNTLKPIFKLPLGINITNFRAILTAFSPAQDIWGTSAA

AYFVGYLKPTTFMLKYDENGTITDAVDCSQNPLAELKCSVKSFEIDKGIY

QTSNFRVVPSGDVVRFPNITNLCPFGEVFNATKFPSVYAWERKKISNCVA

DYSVLYNSTFFSTFKCYGVSATKLNDLCFSNVYADSFVVKGDDVRQIAPG

QTGVlADYNYKLPDDFMGCVIiAlWTRNIDATSTGISra^ 

FERDISNVPFSPDGKPCTPPALNCYWPLNDYGFYTTTGIGYQPYRVVVLS

FELLNAPATVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKRFQPFQQ

FGRDVSDFTDSVRDPKTSEILDISPCAFGGVSVITPGTNASSEVAVLYQD 

VNCTDVSTAIHADQLTPAWRIYSTGNNVFQTQAGCLIGAEHVDTSYECDI

PIGAGICASYHTVSLLRSTSQKSIVAYTMSLGADSSIAYSNNTIAIPTNF

SISITTEVMPVSMAKTSVDCNiyiYICGDSTECANLLLQYGSFC 

GIAAEQDRNTREVFAQVKQMYKTPTLKYFGGFNFSQILPDPLKPTKRSFI 

EDLLFNKVTLADAGFMKQYGECLGDINARDLICAQKFNGLTVLPPLLTDD

MIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYE 

NQKQIANQFNKAISQIQESLTTTSTALGKLQDVVNQNAQALNTLVKQLSS

NFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEI

RASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQAAPHGVVFLHVTYV

PSQERNFTTAPAICHEGKAYFPREGVFVFNGTSWFITQRNFFSPQIITTD

NTFVSGNCDVVIGIINNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGD

ISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYVWL

GFIAGLIAIVMVTILLCCMTSCCSCLKGACSCGSCCKFDEDDSEPVLKGV

KLHYT (SEQ ID NO: 33)



Figure 5 



MADNGTITVEELKQLLEQWNLVIGFLFLAWIMLLQFAYSNRNRFLYIIKL
VFLWLLWPVTLACFVLAAVYRINWVTGGIAIAMACIVGLMWLSYFVASFR
LFARTRSMWSFNPETNILLNVPLRGTIVTRPLMESELVIGAVIIRGHLRM
AGHSLGRCDIKDLPKEITVATSRTLSYYKLGASQRVGTDSGFAAYNRYRI
GNYKLNTDHAGSNDNIALLV (SEQ ID NO: 34)

MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLCAYCCNIVNVS
LVKPTVYVYSRVKNLNSSEGVPDLLV (SEQ ID NO: 35) 



Figure 7 



MSDNGPQSNQRSAPRITFGGPTDSTDNNQNGGRNGARPKQRRPQGLPNNT 

ASWFTALTQHGKEELRFPRGQGVPINTNSGPDDQIGYYRRATRRVRGGDG 

KMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVATEGALNTPKDHIGTR 

NPNNNAATVLQLPQGTTLPKGFYAEGSRGGSQASSRSSSRSRGNSRNSTP 

GSSRGNSPARMASGGGETALALLLLDRLNQLESKVSGKGQQQQGQTVTKK 

SAAEASKKPRQKRTATKQYNVTQAFGRRGPEQTQGNFGDQDLIRQGTDYK 

HWPQIAQFAPSASAFFGMSRIGMEVTPSGTWLTYHGAIKLDDKDPQFKDN 

VILLNKHIDAYKTFPPTEPKKDKKKKTDEAQPLPQRQKKQPTVTLLPAAD 

MDDFSRQLQNSMSGASADSTQA (SEQ ID NO: 36)

BoCov MSSVTTPAP — VYTWTADEAIKFLKEWNFSL

OC43 MSSKTTPAP — VYIWTADEAIKFLKEWNFSL 

PHEV MSSPTTPVP — VISWTADEAIKFLKEWNFSIi 

FCV l^ILLILACAVACVYGEQIRYCAMQ-ETGLSCI^GTASDCESCFNGGDLIWHLAlSnAn^FS 

TGEV MKIIiLILACVlACACGE — RYCAMKSDTDIiSCRNSTASDCESCFNGGDIiIWHLANWNFSW 

TOR2_M MAD--NGTITVEELKQLLEQWNLVI

ORF5 MAD--NGTITVEELKQLLEQWNLVI

AIBV2 MMEN CTMLEQATLLFKEYNLFI

AIBV MSNGTEN CTLSTQQAAELFKEYNLFI

: : : : * : 

BoCov GIILLFXTVILQFGYTSRSMFVYVIK^IVILWriMWPIjTIILTIFJMCV — YALWN-VYtiGFS 

OC43 GIILLFITIILQFGYTSRSMFVYVIKMIILWIiMWPLTIILTIFNCV — YALNN-VYLGLS 

PHEV GIIVLFITIIX/QFGYTSRSMFVYVIKMVILWIiMWPLTlILTIFNCV — YALNN-VYLGFS 

FCV S IILIVFITVLQYGRPQFSWFVYGIKMLIMWLLWPIVLALiTIFNAYSEYEVSRYVMFGFS 

TGEV SI ILIVFITVLQYGRPQFSWFVYGIKMLIMWIiLWPVVLALTIFNAYSEYQVSRYVMFGFS 

TOR2_M GFLFLAWIMLLQFAYSNRNRFLYIIKLVFLWLLWPVTLACFVLAAV — YRINW-VTGGIA

ORF5 GFLFLAWIMLLQFAYSNRNRFLYIIKLVFLWLLWPVTLACFVLAAV — YRINW-VTGGIA

AIBV2 TAFLLFLTIILQYGYATRSRFIYILKMIVLWCFWPLNIAVGVISCI — YPPNT-GGLVAA

AIBV TAPLLFLTILLQYGYATRSRFIYILKMIVLWCFWPLNIAVGIISCI — YPPNT-GGLVAA
:.: :**:.. .*:*:*::.:*:**:: ::. * . : 

BoCov IVFTrVAXIMWIVYPVNSIRLFIRTGSWWSFNPETNNLMCIDiy^ 

OC43 IVFTIVAIIMWIVYFVNSIRLFIRTGSFWSFNPETNNLMCIDMK-GTMYVRPIIEDYHTL 

PHEV IVFTI VAIIMWWYFVNSIRLFIRTGSWWSFNPETNNIiMCIDMK-GRMYVRPI lEDYHTIi 

FCV VAGAWTFALWMMYFVRS IQLYRRTKSWWSFNPETNAILCVNALi-GRSYVIiPLDGTPTGV 

TGEV lAGAIVTFVLWIMYFVRSIQLYRRTKSWWSFNPETKAILCVSAIi-GRSYVLPLEGVPTGV 

TOR2_M IAMACIVGLMWLSYFVASFRLFARTRSMWSFNPETNILLNVPLR-GTIVTRPLMESELVI

ORF5 IAMACIVGLMWLSYFVASFRLFARTRSMWSFNPETNILLNVPLR-GTIVTRPLMESELVI

AIBV2 IILTVFACLSFVGYWIQSCRLFKRCRSWWSFNPESNAVGSILLTNGQQCNFAIESVPMVL

AIBV IILTVFACLSFVGYWIQSFRLFKRCRSWWSFNPESNAVGSILLTNGQQCNFAIESVPMVL
; ;.. ::*::*:*:* *******... . * ^ . . 



BoCov TVTIIRGHLYMQGIKLGTGYSLSDLPAYVTVAKVSHLLTYKR GFIiDKIGDTSGFAVY 

OC43 TVTIIRGHLYIQGIKLGTGYSWADLPAYMTVAKVTHLCTYKR — -GFLDRISDTSGFAVY 

PHEV TATIIRGHIiYIQGIKLGTGYSLSDIiPAYVTVAKVTHLCTYKR- — GPIiDRIGDTSGFAVY 

FCV TLTLIiSGNIjYAEGFKMAGGIiTIEHLPKYVMIRTPNRTIVYTIiV — GKQLKATTATGWAYY 

TGEV TIiTLLSGNLYAEGFKIAGGMNIDNLPKYVMVALPSRTIVYTLV — GKKLKASSATGWAYY 

TOR2_M GAVIIRGHLRMAGHSLGR-CDIKDLPKEITVAT-SRTLSYYKL — GASQRVGTDSGFAAY

ORF5 GAVIIRGHLRMAGHSLGR-CDIKDLPKEITVAT-SRTLSYYKL — GASQRVGTDSGFAAY

AIBV2 APIIKNGVLYCEGQWLAK-CEPDHLPKDIFVCTPDRRNIYRMVQKYTGDQSGNKKRVATF

AIBV SPIIKNGALYCEGQWLAK-CEPDHLPKDIFVCTPDRKNIYRMVQKYTGDQSGNKKRPATF

I * * ^ it* . . . -k . ^ * • 

BoCov VKSKVGMYRLPSTQKGSGLDTALLRNNX 

OC43 VKSKVGNYRLPSTQKGSGMDTALLRNNI 

PHEV VKSKVGNYRLPSTHKGSGMDTALLRNNI 

FCV VKSKAGDYSTEARTDNLSEHEKLLHMV- 

TGEV VKSKAGDYSTEARTDNLSEQEKLLHMV- 

TOR2_M NRYRIGNYKLNTDHAGSNDNIALLVQ —

ORF5 NRYRIGNYKLNTDHAGSNDNIALLVQ —

AIBV2 VYAKQSVDTGELESVPTGGSSLYT

AIBV VYAKQSVDTGELGSVATGGSSLYT

Key Name Genbank %ID

PHEV Porcine hemagglutinating encephalomyelitis virus AAL80035 40.4% (SEQ ID NO: 37)

BoCov matrix protein [Bovine coronavirus]. NP_150082 40.0% (SEQ ID NO: 38)

AIBV membrane protein [Avian infectious bronchitis virus]. AAF35863 31.3% (SEQ ID NO: 39)

TGEV membrane protein [Transmissible gastroenteritis virus]. NP_058427 28.5% (SEQ ID NO: 40)

FCV membrane [feline coronavirus]. BAC01160 27.7% (SEQ ID NO: 41)

OC43 membrane glycoprotein [Human coronavirus OC43]. AAA45462 39.1% (SEQ ID NO: 42)

AIBV2 membrane protein [Avian infectious bronchitis virus]. AAK83027 32.0% (SEQ ID NO: 43)

TOR2_M/ORF5 SARS associated coronavirus M glycoprotein (SEQ ID NO: 34)

BoCov MSFTPGKQSS-SRASSGNRSGNGILK WADQSDQSRNVQTRGRRAQP — KQTATSQQP

OC43 MSFTPGKQSS-SRASSGNRSGNGILK WADQSDQVRNVQTRGRRAQP — KQTATSQQP

PHEV MSFTPGKQSS-SRASSGNRSGNGILK WADQSDQSRNVQTRGRRVQS — KQTATSQQP

MHV MSFVPGQENAGSRSSSVNRAGNGILKKTTWADQTERGPNNQNRGRRNQP — KQTATTQ-P

AIBV2 MASGKAAGK TDAPAPVIK LGGPKPP — KVGSSGN —

TCV MASGICATGK TDAPAPIIK -LGGPKPP — KVGSSGN —

AIBV MASGKAAGK TDAPAPVIK LGGPKPP — KVGSSGN —

FCV MATQGQRVN WGDEPSKRR GRSNSR — GRKNNDIP-

PTGV MANQGQRVS WGDE S TKTR GRSNSR — GRKNNNI P -

229E MATVK WADASEPQR GRQ — GRIPYS L- -

TOR2_N MSDNGPQSNQRSAPRITFGGPTDSTDNNQNGGRNGARPKQRRPQGLPN



BoCov SGGWVPYYSWFSGITQFQKGKEFEFAEGQGVPIAPGVPATEAKGYWYRHNRRSFKTADG 

OC43 SGGNWPYYSWFSGITQFQKGKEFEFVEGQGPPIAPGVPATEAKGYWYRHNRGSFKTADG

PHEV SGGTWPYYSWFSGITQFQKGKEFEFAEGQGVPIAPGVPSTEAKGYWYRHNRRSFKTADG 

MHV NSGSWPHYSWFSGITQFQKGKEFQFAQGQGVPIANGIPASEQKGYWYRHNRRSFKTPDG 

AIBV2 AS WFQAIKAKKLNTPPPKFEGSGVPDNENIKPSQQHGYWRRQAR — FKPGKG 

TCV AS WFQSIKAKKLNSPQPKFEGSGVPDNENIKTSQQHGYWRRQAR— FKPGKG 

AIBV AS WPQALKAKKLNAPAPKFEGSGVPDNENLKI SQQHGYWRRQAR — YKPGKG 

FCV LS YFNPITLDQGSKFWWLCPRDFVPKGIGNK-DQQIGYWNRQAR — YRIVKG 

PTGV LS FFNPITLQQGSKFWNLCPRDFVPKGIGNR-DQQIGYWNRQTR — YRMVKG 

229E - Y SPLLVDS-EQPWKVIPRNLVPINKKDK-NKLIGYWNVQKR— FRTRKG

TOR2_N NTAS WFTALTQHG-KEELRFPRGQGVPINTNSGPDDQIGYYRRATRR-VRGGDG

: .* *:.* 



BoCov NQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVYWVASNQADVNTPADILDRDPSSDEAIPT 

OC43 NQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVYWVASNQADVNTPADIVDRDPSSDEAIPT 

PHEV NQRQLLPRWYFYYLGTGPHAKDQYGTDIDGVFWVASNQADINTPADIVDRDPSSDEAIPT 

MHV QQKQLLPRWYFYYLGTGPHAGAEYGDDIDGWWVASQQADTKTTADIVERDPSSHEAIPT 

AIBV2 GRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVAAKGADTKSRSNQGTRDPDKFDQYPL 

TCV GRKPVPDAWYFYYTGTGPAADLNWGDTQDGIVWVAAKGADVKSRSNQGTRDPDKFDQYPL 

AIBV GRKPVPDAWYFYYTGTGPAADLNWGDSQDGIVWVAAKGADVKSRSNQGTRDPDKFDQYPL 

FCV QRVELPERWFFYFLGTGPHADAKFKAKIDGVFWVARDGAMN-KPTSLGTRG-TNNESKPL 

PTGV QRKELPERWFFYYLGTGPHADAKFKDKLDGWWVAKDGAMN-KPTTLGSRG-ANNESKAL 

229E KRVDLSPKLHFYYLGTGPHKDAKFRERVEGWWVAVDGAKT-EPTGYGVRR-KNSEPEIP

TOR2_N KMKELSPRWYFYYLGTGPEASLPYGANKEGIVWVATEGALNTPKDHIGTRNPNNNAATVL

. **** : **ie ie * ^ 



BoCov RFPPGTVLPQGYYIEGS-GRSAPNSRSTSRASSRASSA GSRSRANSGNR TPTSG 

OC43 RFPPGTVIiPQGYYIEGS-GRSAPNSRSTSRTSSRASSA GSRSRANSGNR TPTSG 

PHEV RFPPGTVLPQGYYIEGS-GRSAPNSRSTSRAPNRAPSA GSRSRANSGNR TSTPG 

MHV RFAPGTVLPQGFYVEGS-GRSAPASRSGSRSQSRGP NNRARSSSNQR QPAST 

AIBV2 RFSDG — GPDGNFRWDF-IPLKNRGRSG-RSTAASSAA ASRAPSREGSR GRRSD

TCV RFSDG — GPDSNFRWDF-IPLH-RGRSG~RSTAASSAA-^-SSRAPSRDGSR GRRSG 

AIBV RFSDG — GPDGNFRWDF-IPLM-RGRSG-RSTAASSAA SSRAPSREGSR GRLNG 

FCV KFDGK>-IPPQFQLEVNR-SRNNSRSGSQSRSVSRNRS -QSRGRQQSNNQ — NTNVED 

PTGV KFDGK-VPGEFQLEVNQ-SRDNSRLRSQSRSRSRNRS QSRGRQQSNNKK-DDSVEQ 

229E HFNQK — LPNGVTWEE-PDSRAPSRSQSRSQSRGRGESKPQSRNPSSDRNHNSQDDIMK

TOR2_N QLPQGTTLPKGFYAEGSRGGSQASSRSSSRSRGNSRNSTPGSSRGNSPARMAS-GGGETA 



BoCov VTPDMADQIASLVLAKLGKDAAKP QQVTKQTAKEIRQK--IL 

OC43 VTPDMADQIASLVLAKLGKDATKP QQVTKHTAKEVRQK — IL

PHEV VTPDMADQIASLVLAKLGKDATKP QQVTKQTAKEVRQK — IL 

MHV VKPDMAEEIAALVLAKLGKDAGQP KQVTKQSAKEVRQK — IL 

AIBV2 SGDDLIARAAKIIQDQQKKGS RITKAKADEMAHR — RY 

TCV SEDDLIARAAKIIQDQQKKGS RITKAKADEMAHR-RY 

AIBV AEDDLIARAAKIIQDQQKKGS RITKAKAEEMIHR- -RY 

FCV TIVAVLQKLGVTDK QRSRSKS GERSQSKSRDTTPK — NA 

PTGV AVIiAALKKLGVYTEKQQQRSRSKS KERSNSKIRDTTPK — NE 

229E AVAAALKSLGFDKPQEKDKKSAKTGTPKPSRNQSPASSQTSAKSLARSQSSETKEQKHEM

TOR2_N LALLLLDRLNQLESKVSGKGQQQQG QTVTKKSAAEASKK — PR

BoCov NKPRQKRSPNKQCT — VQQCFGKR GPNQNFGGGEMLKLGTSDPQFPILAELAPTAGA

OC43 NKPRQKRSPNKQCT — VQQCFGKR GPNQNFGGGEMLKLGTSDPQFPILAELAPTAGA

PHEV NKPRQKRSPNKQCT — VQQCFGKR GPNQNFGGGEMLKLGTSDPQFPILAELAPTAGA 

MHV NKPRQKRTPNKQCP — VQQCFGKR GPNQNFGGSEMLKLGTSDPQFPILAELAPTPSA 

AIBV2 CK RTIPPNYR — VDQVFGPRT-KGKEGNFGDDKMNEEGIKDGRVTAMLNLVPS SHA 

TCV CK RTVPPGYK — VDQVPGPRT-KGKEGNPGDDKMNEEGIKDGRVTAMLNLVPSSHA 

AIBV CK RTVPPGVS — IDKVFGPRT-KGKEGNFGDDKMNEEGIKDGRVTAMLNLVPSSHA 

FCV NKHTWKKTAGKGD VTNFYGAR SSSANFGDSDLVANGNAAKCYPQIAECVPSVSS 

PTGV NKHTWKRTAGKGD VTRFYGTR SNSANFGDSDLVANGS S AKHYPQLAECVPS VS S 

229E QKPRWKRQPNDDVTSNVTQCFGPR DLDLNFGSAGWANGVKAKGYPQFAELVPSTAA

TOR2_N QK RTATKQYN — VTQAFGRRGPEQTQGNFGDQDLIRQGTDYKHWPQIAQFAPSASA

* ; : . :* * , ***, : * . : : ,*: : 



BoCov FFFGSRIiELAKVQNIjSGNLDEPQKDVYELRYNGAIR FDSTLSGFETIMKVLNENL 

OC43 FPFGSRLELAKVQNLSGNPDEPQKDVYELRYNGAIR FDSTLSGFETIMKVLNENL

PHEV FFFGSRLELAKVQNLSGNPDEPQKDVYELRYNGAIR FDSTLSGFETIMKVLNQNIi 

MHV FPFGSKLEIjVKKN — SGGADDPTKDVYEIjQYSGAIR FDSTLPGPETIMKVLNENL 

AIBV2 CLFGSRVTPKLQL — DGLHLRFEFTTWPCDDPQFDNYVKICDQCVDGVGTRPKDDEPKP 

TCV CLFGSRVTPKLQP — DGLHIiRFEFTTWPRDDPQPDNYVTICDQCVDGIGTRPKDNEPRP 

AIBV CLFGSQVTPKLQP — DGLHLTFRFTTWSRDDPQFDNYVKICDECVDGVGTRPKDEWRP 

FCV ILFGSQWSAEEAG — DQVKVTLTHNYYLPKDDAKTS QPLEQI 

PTGV ILFGSYWTSKEDG— DQIEVTFTHKYHLPKDDPKTG QFLQQI 

229E MLFDSHIVSKESG — NTWLTFTTRVTVPKDHPHLG KFLEEL

TOR2_N FFGMSRIGMEVTP — SGTWLTYHGAIKLDDKDPQFK DN VILLNKHI 



BoCov NAYQQQ-DGTMNMSPKPQRQRG QKNGQGENDNISVAAPKSRVQQNKIRELTAEDIS

OC43 NAYQQQ-DGMMNMSPKPQRQRG HKNGQGENDNISVAVPKSRVQQNKSRELTAEDIS

PHEV NAYQHQEDGMMNISPKPQRQRG QKNGQVENDNVSVAAPKSRVQQNKSRELTAEDIS

MHV DAYQDQAGGADWSPKPQRKRGT— KQKALKGEVDNVSVAKPKSSVQRNVSRELTPEDRS

AIBV2 KSRS S SRPATRGNS PAPRQQRPK- - KEKKLKKQDDEADKALTSDEERNNAQLEFYDEP - K

TCV KSRP S SRPATRGNS PAPRQQRPK — KEKKPKKQDDEVDKALTSDEERNNAQ LEFDDEP - K

AIBV KSRSSSRPATRGTSPAPKQQRPK — KEKKPKKQDDEVDKALTSDEERNNAQLEFDDEP-K

FCV DAYKRP SEVAKDQRQ — — RKSRSKSADKKPEELS — VTLEAYTDVFDDTQVE

PTGV NAYARP SEVAKEQRK RKSRSKSAERSEQEWPDALIENYTDVFDDTQVE

229E NAFTRE — MQQHP — LLNPSALEFNPSQTSPATAEPVRDEVSIET-D

TOR2_N DAYKTFPP TEPKKDKKKKTDEAQPLPQRQKKQPTVTLLPAADMDDFSRQLQNSMSG



BoCov LLKKMDEP FTEDTSEI 

OC43 LLKKMDEP YTEDTSEI 

PHEV LLKKMDEP YTEDTSEI 

MHV LLAQILDDGWPDGLEDDSNV 

AIBV2 VINWGDAA LGENEL — 

TCV VINWGDSA LGENHL — 

AIBV VINWGDSA LGENEL — 

FCV MIDEVTN 

PTGV MIDEVTN 

229E IIDEVN 

TOR2_N ASADSTQA 



Key Name Genbank %ID

MHV NUCLEOCAPSID PROTEIN P18446 34.3% (SEQ ID NO: 44)

BoCov nucleocapsid protein [Bovine coronavirus]. NP_150083 34.4% (SEQ ID NO: 45)

AIBV nucleocapsid protein [Avian infectious bronchitis virus]. AAK27162 28.3% (SEQ ID NO: 46)

FCV nucleocapsid [Feline coronavirus]. CAA74230 29.4% (SEQ ID NO: 47)

PTGV nucleoprotein [porcine transmissible gastroenteritis virus]. AAM97563 28.0% (SEQ ID NO: 48)

229E nucleocapsid protein [Human coronavirus 229E]. NP_073556 24.6% (SEQ ID NO: 49)

OC43 NUCLEOCAPSID PROTEIN. P33469 33.9% (SEQ ID NO: 50)

PHEV nucleocapsid protein [porcine hemagglutinating encephalomyelitis] AAL80036 33.3% (SEQ ID NO: 51)

TCV nucleocapsid protein [turkey coronavirus]. AAF23873 28.2% (SEQ ID NO: 52)

TOR2_N SARS associated virus nucleocapsid protein (SEQ ID NO: 36)

ATATTAGGTTTTTACCTACCCAGGAAAAGCCAACCAACCTCGATCTCTTG
TAGATCTGTTCTCTAAACGAACTTTAAAATCTGTGTAGCTGTCGCTCGGC 
TGCATGCCTAGTGCACCTACGCAGTATAAACAATAATAAATTTTACTGTC 
GTTGACAAGAAACGAGTAACTCGTCCCTCTTCTGCAGACTGCTTACGGTT 
TCGTCCGTGTTGCAGTCGATCATCAGCATACCTAGGTTTCGTCCGGGTGT 
GACCGAAAGGTAAGATGGAGAGCCTTGTTCTTGGTGTCAACGAGAAAACA 
CACGTCCAACTCAGTTTGCCTGTCCTTCAGGTTAGAGACGTGCTAGTGCG 
TGGCTTCGGGGACTCTGTGGAAGAGGCCCTATCGGAGGCACGTGAACACC 
TCAAAAATGGCACTTGTGGTCTAGTAGAGCTGGAAAAAGGCGTACTGCCC 
CAGCTTGAACAGCCCTATGTGTTCATTAAACGTTCTGATGCCTTAAGCAC 
CAATCACGGCCACAAGGTCGTTGAGCTGGTTGCAGAAATGGACGGCATTC 
AGTACGGTCGTAGCGGTATAACACTGGGAGTACTCGTGCCACATGTGGGC 
GAAACCCCAATTGCATACCGCAATGTTCTTCTTCGTAAGAACGGTAATAA 
GGGAGCCGGTGGTCATAGCTATGGCATCGATCTAAAGTCTTATGACTTAG 
GTGACGAGCTTGGCACTGATCCCATTGAAGATTATGAACAAAACTGGAAC 
ACTAAGCATGGCAGTGGTGCACTCCGTGAACTCACTCGTGAGCTCAATGG 
AGGTGCAGTCACTCGCTATGTCGACAACAATTTCTGTGGCCCAGATGGGT 
ACCCTCTTGATTGCATCAAAGATTTTCTCGCACGCGCGGGCAAGTCAATG 
TGCACTCTTTCCGAACAACTTGATTACATCGAGTCGAAGAGAGGTGTCTA 
CTGCTGCCGTGACCATGAGCATGAAATTGCCTGGTTCACTGAGCGCTCTG 
ATAAGAGCTACGAGCACCAGACACCCTTCGAAATTAAGAGTGCCAAGAAA 
TTTGACACTTTCAAAGGGGAATGCCCAAAGTTTGTGTTTCCTCTTAACTC 
AAAAGTCAAAGTCATTCAACCACGTGTTGAAAAGAAAAAGACTGAGGGTT 
TCATGGGGCGTATACGCTCTGTGTACCCTGTTGCATCTCCACAGGAGTGT 
AACAATATGCACTTGTCTACCTTGATGAAATGTAATCATTGCGATGAAGT 
TTCATGGCAGACGTGCGACTTTCTGAAAGCCACTTGTGAACATTGTGGCA 
CTGAAAATTTAGTTATTGAAGGACCTACTACATGTGGGTACCTACCTACT 
AATGCTGTAGTGAAAATGCCATGTCCTGCCTGTCAAGACCCAGAGATTGG 
ACCTGAGCATAGTGTTGCAGATTATCACAACCACTCAAACATTGAAACTC 
GACTCCGCAAGGGAGGTAGGACTAGATGTTTTGGAGGCTGTGTGTTTGCC 
TATGTTGGCTGCTATAATAAGCGTGCCTACTGGGTTCCTCGTGCTAGTGC 
TGATATTGGCTCAGGCCATACTGGCATTACTGGTGACAATGTGGAGACCT 
TGAATGAGGATCTCCTTGAGATACTGAGTCGTGAACGTGTTAACATTAAC 
ATTGTTGGCGATTTTCATTTGAATGAAGAGGTTGCCATCATTTTGGCATC 
TTTCTCTGCTTCTACAAGTGCCTTTATTGACACTATAAAGAGTCTTGATT 
ACAAGTCTTTCAAAACCATTGTTGAGTCCTGCGGTAACTATAAAGTTACC 
AAGGGAAAGCCCGTAAAAGGTGCTTGGAACATTGGACAACAGAGATCAGT 
TTTAACACCACTGTGTGGTTTTCCCTCACAGGCTGCTGGTGTTATCAGAT 
CAATTTTTGCGCGCACACTTGATGCAGCAAACCACTCAATTCCTGATTTG 
CAAAGAGCAGCTGTCACCATACTTGATGGTATTTCTGAACAGTCATTACG 
TCTTGTCGACGCCATGGTTTATACTTCAGACCTGCTCACCAACAGTGTCA 
TTATTATGGCATATGTAACTGGTGGTCTTGTACAACAGACTTCTCAGTGG 
TTGTCTAATCTTTTGGGCACTACTGTTGAAAAACTCAGGCCTATCTTTGA 
ATGGATTGAGGCGAAACTTAGTGCAGGAGTTGAATTTCTCAAGGATGCTT 
GGGAGATTCTCAAATTTCTCATTACAGGTGTTTTTGACATCGTCAAGGGT 
CAAATACAGGTTGCTTCAGATAACATCAAGGATTGTGTAAAATGCTTCAT 
TGATGTTGTTAACAAGGCACTCGAAATGTGCATTGATCAAGTCACTATCG 
CTGGCGCAAAGTTGCGATCACTCAACTTAGGTGAAGTCTTCATCGCTCAA 
AGCAAGGGACTTTACCGTCAGTGTATACGTGGCAAGGAGCAGCTGCAACT 
ACTCATGCCTCTTAAGGCACCAAAAGAAGTAACCTTTCTTGAAGGTGATT 
CACATGACACAGTACTTACCTCTGAGGAGGTTGTTCTCAAGAACGGTGAA 
CTCGAAGCACTCGAGACGCCCGTTGATAGCTTCACAAATGGAGCTATCGT 
TGGCACACCAGTCTGTGTAAATGGCCTCATGCTCTTAGAGATTAAGGACA 
AAGAACAATACTGCGCATTGTCTCCTGGTTTACTGGCTACAAACAATGTC 
TTTCGCTTAAAAGGGGGTGCACCAATTAAAGGTGTAACCTTTGGAGAAGA 
TACTGTTTGGGAAGTTCAAGGTTACAAGAATGTGAGAATCACATTTGAGC 
TTGATGAACGTGTTGACAAAGTGCTTAATGAAAAGTGCTCTGTCTACACT
GTTGAATCCGGTACCGAAGTTACTGAGTTTGCATGTGTTGTAGCAGAGGC
TGTTGTGAAGACTTTACAACCAGTTTCTGATCTCCTTACCAACATGGGTA 
TTGATCTTGATGAGTGGAGTGTAGCTACATTCTACTTATTTGATGATGCT 
GGTGAAGAAAACTTTTCATCACGTATGTATTGTTCCTTTTACCCTCCAGA 
TGAGGAAGAAGAGGACGATGCAGAGTGTGAGGAAGAAGAAATTGATGAAA 
CCTGTGAACATGAGTACGGTACAGAGGATGATTATCAAGGTCTCCCTCTG 
GAATTTGGTGCCTCAGCTGAAACAGTTCGAGTTGAGGAAGAAGAAGAGGA 
AGACTGGCTGGATGATACTACTGAGCAATCAGAGATTGAGCCAGAACCAG 
AACCTACACCTGAAGAACCAGTTAATCAGTTTACTGGTTATTTAAAACTT 
ACTGACAATGTTGCCATTAAATGTGTTGACATCGTTAAGGAGGCACAAAG 
TGCTAATCCTATGGTGATTGTAAATGCTGCTAACATACACCTGAAACATG 
GTGGTGGTGTAGCAGGTGCACTCAACAAGGCAACCAATGGTGCCATGCAA 
AAGGAGAGTGATGATTACATTAAGCTAAATGGCCCTCTTACAGTAGGAGG 
GTCTTGTTTGCTTTCTGGACATAATCTTGCTAAGAAGTGTCTGCATGTTG 
TTGGACCTAACCTAAATGCAGGTGAGGACATCCAGCTTCTTAAGGCAGCA 
TATGAAAATTTCAATTCACAGGACATCTTACTTGCACCATTGTTGTCAGC 
AGGCATATTTGGTGCTAAACCACTTCAGTCTTTACAAGTGTGCGTGCAGA 
CGGTTCGTACACAGGTTTATATTGCAGTCAATGACAAAGCTCTTTATGAG 
CAGGTTGTCATGGATTATCTTGATAACCTGAAGCCTAGAGTGGAAGCACC 
TAAACAAGAGGAGCCACCAAACACAGAAGATTCCAAAACTGAGGAGAAAT 
CTGTCGTACAGAAGCCTGTCGATGTGAAGCCAAAAATTAAGGCCTGCATT 
GATGAGGTTACCACAACACTGGAAGAAACTAAGTTTCTTACCAATAAGTT 
ACTCTTGTTTGCTGATATCAATGGTAAGCTTTACCATGATTCTCAGAACA 
TGCTTAGAGGTGAAGATATGTCTTTCCTTGAGAAGGATGCACCTTACATG 
GTAGGTGATGTTATCACTAGTGGTGATATCACTTGTGTTGTAATACCCTC
CAAAAAGGCTGGTGGCACTACTGAGATGCTCTCAAGAGCTTTGAAGAAAG 
TGCCAGTTGATGAGTATATAACCACGTACCCTGGACAAGGATGTGCTGGT 
TATACACTTGAGGAAGCTAAGACTGCTCTTAAGAAATGCAAATCTGCATT 
TTATGTACTACCTTCAGAAGCACCTAATGCTAAGGAAGAGATTCTAGGAA 
CTGTATCCTGGAATTTGAGAGAAATGCTTGCTCATGCTGAAGAGACAAGA 
AAATTAATGCCTATATGCATGGATGTTAGAGCCATAATGGCAACCATCCA 
ACGTAAGTATAAAGGAATTAAAATTCAAGAGGGCATCGTTGACTATGGTG 
TCCGATTCTTCTTTTATACTAGTAAAGAGCCTGTAGCTTCTATTATTACG 
AAGCTGAACTCTCTAAATGAGCCGCTTGTCACAATGCCAATTGGTTATGT 
GACACATGGTTTTAATCTTGAAGAGGCTGCGCGCTGTATGCGTTCTCTTA 
AAGCTCCTGCCGTAGTGTCAGTATCATCACCAGATGCTGTTACTACATAT 
AATGGATACCTCACTTCGTCATCAAAGACATCTGAGGAGCACTTTGTAGA 
AACAGTTTCTTTGGCTGGCTCTTACAGAGATTGGTCCTATTCAGGACAGC 
GTACAGAGTTAGGTGTTGAATTTCTTAAGCGTGGTGACAAAATTGTGTAC 
CACACTCTGGAGAGCCCCGTCGAGTTTCATCTTGACGGTGAGGTTCTTTC 
ACTTGACAAACTAAAGAGTCTCTTATCCCTGCGGGAGGTTAAGACTATAA 
AAGTGTTCACAACTGTGGACAACACTAATCTCCACACACAGCTTGTGGAT 
ATGTCTATGACATATGGACAGCAGTTTGGTCCAACATACTTGGATGGTGC 
TGATGTTACAAAAATTAAACCTCATGTAAATCATGAGGGTAAGACTTTCT
TTGTACTACCTAGTGATGACACACTACGTAGTGAAGCTTTCGAGTACTAC 
CATACTCTTGATGAGAGTTTTCTTGGTAGGTACATGTCTGCTTTAAACCA 
CACAAAGAAATGGAAATTTCCTCAAGTTGGTGGTTTAACTTCAATTAAAT 
GGGCTGATAACAATTGTTATTTGTCTAGTGTTTTATTAGCACTTCAACAG 
CTTGAAGTCAAATTCAATGCACCAGCACTTCAAGAGGCTTATTATAGAGC 
CCGTGCTGGTGATGCTGCTAACTTTTGTGCACTCATACTCGCTTACAGTA 
ATAAAACTGTTGGCGAGCTTGGTGATGTCAGAGAAACTATGACCCATCTT 
CTACAGCATGCTAATTTGGAATCTGCAAAGCGAGTTCTTAATGTGGTGTG
TAAACATTGTGGTCAGAAAACTACTACCTTAACGGGTGTAGAAGCTGTGA 
TGTATATGGGTACTCTATCTTATGATAATCTTAAGACAGGTGTTTCCATT 
CCATGTGTGTGTGGTCGTGATGCTACACAATATCTAGTACAACAAGAGTC 
TTCTTTTGTTATGATGTCTGCACCACCTGCTGAGTATAAATTACAGCAAG 
GTACATTCTTATGTGCGAATGAGTACACTGGTAACTATCAGTGTGGTCAT
TACACTCATATAACTGCTAAGGAGACCCTCTATCGTATTGACGGAGCTCA
CCTTACAAAGATGTCAGAGTACAAAGGACCAGTGACTGATGTTTTCTACA 
AGGAAACATCTTACACTACAACCATCAAGCCTGTGTCGTATAAACTCGAT 
GGAGTTACTTACACAGAGATTGAACCAAAATTGGATGGGTATTATAAAAA 
GGATAATGCTTACTATACAGAGCAGCCTATAGACCTTGTACCAACTCAAC 
CATTACCAAATGCGAGTTTTGATAATTTCAAACTCACATGTTCTAACACA 
AAATTTGCTGATGATTTAAATCAAATGACAGGCTTCACAAAGCCAGCTTC 
ACGAGAGCTATCTGTCACATTCTTCCCAGACTTGAATGGCGATGTAGTGG 
CTATTGACTATAGACACTATTCAGCGAGTTTCAAGAAAGGTGCTAAATTA 
CTGCATAAGCCAATTGTTTGGCACATTAACCAGGCTACAACCAAGACAAC 
GTTCAAACCAAACACTTGGTGTTTACGTTGTCTTTGGAGTACAAAGCCAG 
TAGATACTTCAAATTCATTTGAAGTTCTGGCAGTAGAAGACACACAAGGA 
ATGGACAATCTTGCTTGTGAAAGTCAACAACCCACCTCTGAAGAAGTAGT 
GGAAAATCCTACCATACAGAAGGAAGTCATAGAGTGTGACGTGAAAACTA 
CCGAAGTTGTAGGCAATGTCATACTTAAACCATCAGATGAAGGTGTTAAA 
GTAACACAAGAGTTAGGTCATGAGGATCTTATGGCTGCTTATGTGGAAAA 
CACAAGCATTACCATTAAGAAACCTAATGAGCTTTCACTAGCCTTAGGTT 
TAAAAACAATTGCCACTCATGGTATTGCTGCAATTAATAGTGTTCCTTGG 
AGTAAAATTTTGGCTTATGTCAAACCATTCTTAGGACAAGCAGCAATTAC 
AACATCAAATTGCGCTAAGAGATTAGCACAACGTGTGTTTAACAATTATA 
TGCCTTATGTGTTTACATTATTGTTCCAATTGTGTACTTTTACTAAAAGT 
ACCAATTCTAGAATTAGAGCTTCACTACCTACAACTATTGCTAAAAATAG 
TGTTAAGAGTGTTGCTAAATTATGTTTGGATGCCGGCATTAATTATGTGA 
AGTCACCCAAATTTTCTAAATTGTTCACAATCGCTATGTGGCTATTGTTG 
TTAAGTATTTGCTTAGGTTCTCTAATCTGTGTAACTGCTGCTTTTGGTGT 
ACTCTTATCTAATTTTGGTGCTCCTTCTTATTGTAATGGCGTTAGAGAAT 
TGTATCTTAATTCGTCTAACGTTACTACTATGGATTTCTGTGAAGGTTCT 
TTTCCTTGCAGCATTTGTTTAAGTGGATTAGACTCCCTTGATTCTTATCC 
AGCTCTTGAAACCATTCAGGTGACGATTTCATCGTACAAGCTAGACTTGA 
CAATTTTAGGTCTGGCCGCTGAGTGGGTTTTGGCATATATGTTGTTCACA 
AAATTCTTTTATTTATTAGGTCTTTCAGCTATAATGCAGGTGTTCTTTGG 
CTATTTTGCTAGTCATTTCATCAGCAATTCTTGGCTCATGTGGTTTATCA 
TTAGTATTGTACAAATGGCACCCGTTTCTGCAATGGTTAGGATGTACATC 
TTCTTTGCTTCTTTCTACTACATATGGAAGAGCTATGTTCATATCATGGA 
TGGTTGCACCTCTTCGACTTGCATGATGTGCTATAAGCGCAATCGTGCCA 
CACGCGTTGAGTGTACAACTATTGTTAATGGCATGAAGAGATCTTTCTAT 
GTCTATGCAAATGGAGGCCGTGGCTTCTGCAAGACTCACAATTGGAATTG 
TCTCAATTGTGACACATTTTGCACTGGTAGTACATTCATTAGTGATGAAG 
TTGCTCGTGATTTGTCACTCCAGTTTAAAAGACCAATCAACCCTACTGAC 
CAGTCATCGTATATTGTTGATAGTGTTGCTGTGAAAAATGGCGCGCTTCA 
CCTCTACTTTGACAAGGCTGGTCAAAAGACCTATGAGAGACATCCGCTCT 
CCCATTTTGTCAATTTAGACAATTTGAGAGCTAACAACACTAAAGGTTCA 
CTGCCTATTAATGTCATAGTTTTTGATGGCAAGTCCAAATGCGACGAGTC 
TGCTTCTAAGTCTGCTTCTGTGTACTACAGTCAGCTGATGTGCCAACCTA 
TTCTGTTGCTTGACCAAGCTCTTGTATCAGACGTTGGAGATAGTACTGAA 
GTTTCCGTTAAGATGTTTGATGCTTATGTCGACACCTTTTCAGCAACTTT 
TAGTGTTCCTATGGAAAAACTTAAGGCACTTGTTGCTACAGCTCACAGCG 
AGTTAGCAAAGGGTGTAGCTTTAGATGGTGTCCTTTCTACATTCGTGTCA 
GCTGCCCGACAAGGTGTTGTTGATACCGATGTTGACACAAAGGATGTTAT 
TGAATGTCTCAAACTTTCACATCACTCTGACTTAGAAGTGACAGGTGACA 
GTTGTAACAATTTCATGCTCACCTATAATAAGGTTGAAAACATGACGCCC 
AGAGATCTTGGCGCATGTATTGACTGTAATGCAAGGCATATCAATGCCCA 
AGTAGCAAAAAGTCACAATGTTTCACTCATCTGGAATGTAAAAGACTACA 
TGTCTTTATCTGAACAGCTGCGTAAACAAATTCGTAGTGCTGCCAAGAAG 
AACAACATACCTTTTAGACTAACTTGTGCTACAACTAGACAGGTTGTCAA 
TGTCATAACTACTAAAATCTCACTCAAGGGTGGTAAGATTGTTAGTACTT 
GTTTTAAACTTATGCTTAAGGCCACATTATTGTGCGTTCTTGCTGCATTG
GTTTGTTATATCGTTATGCCAGTACATACATTGTCAATCCATGATGGTTA
CACAAATGAAATCATTGGTTACAAAGCCATTCAGGATGGTGTCACTCGTG 
ACATCATTTCTACTGATGATTGTTTTGCAAATAAACATGCTGGTTTTGAC 
GCATGGTTTAGCCAGCGTGGTGGTTCATACAAAAATGACAAAAGCTGCCC 
TGTAGTAGCTGCTATCATTACAAGAGAGATTGGTTTCATAGTGCCTGGCT 
TACCGGGTACTGTGCTGAGAGCAATCAATGGTGACTTCTTGCATTTTCTA 
CCTCGTGTTTTTAGTGCTGTTGGCAACATTTGCTACACACCTTCCAAACT 
CATTGAGTATAGTGATTTTGCTACCTCTGCTTGCGTTCTTGCTGCTGAGT 
GTACAATTTTTAAGGATGCTATGGGCAAACCTGTGCCATATTGTTATGAC 
ACTAATTTGCTAGAGGGTTCTATTTCTTATAGTGAGCTTCGTCCAGACAC 
TCGTTATGTGCTTATGGATGGTTCCATCATACAGTTTCCTAACACTTACC 
TGGAGGGTTCTGTTAGAGTAGTAACAACTTTTGATGCTGAGTACTGTAGA 
CATGGTACATGCGAAAGGTCAGAAGTAGGTATTTGCCTATCTACCAGTGG 
TAGATGGGTTCTTAATAATGAGCATTACAGAGCTCTATCAGGAGTTTTCT 
GTGGTGTTGATGCGATGAATCTCATAGCTAACATCTTTACTCCTCTTGTG 
CAACCTGTGGGTGCTTTAGATGTGTCTGCTTCAGTAGTGGCTGGTGGTAT 
TATTGCCATATTGGTGACTTGTGCTGCCTACTACTTTATGAAATTCAGAC 
GTGTTTTTGGTGAGTACAACCATGTTGTTGCTGCTAATGCACTTTTGTTT 
TTGATGTCTTTCACTATACTCTGTCTGGTACCAGCTTACAGCTTTCTGCC 
GGGAGTCTACTCAGTCTTTTACTTGTACTTGACATTCTATTTCACCAATG 
ATGTTTCATTCTTGGCTCACCTTCAATGGTTTGCCATGTTTTCTCCTATT 
GTGCCTTTTTGGATAACAGCAATCTATGTATTCTGTATTTCTCTGAAGCA 
CTGCCATTGGTTCTTTAACAACTATCTTAGGAAAAGAGTCATGTTTAATG 
GAGTTACATTTAGTACCTTCGAGGAGGCTGCTTTGTGTACCTTTTTGCTC 
AACAAGGAAATGTACCTAAAATTGCGTAGCGAGACACTGTTGCCACTTAC 
ACAGTATAACAGGTATCTTGCTCTATATAACAAGTACAAGTATTTCAGTG 
GAGCCTTAGATACTACCAGCTATCGTGAAGCAGCTTGCTGCCACTTAGCA 
AAGGCTCTAAATGACTTTAGCAACTCAGGTGCTGATGTTCTCTACCAACC 
ACCACAGACATCAATCACTTCTGCTGTTCTGCAGAGTGGTTTTAGGAAAA 
TGGCATTCCCGTCAGGCAAAGTTGAAGGGTGCATGGTACAAGTAACCTGT 
GGAACTACAACTCTTAATGGATTGTGGTTGGATGACACAGTATACTGTCC 
AAGACATGTCATTTGCACAGCAGAAGACATGCTTAATCCTAACTATGAAG 
ATCTGCTCATTCGCAAATCCAACCATAGCTTTCTTGTTCAGGCTGGCAAT 
GTTCAACTTCGTGTTATTGGCCATTCTATGCAAAATTGTGTGCTTAGGCT 
TAAAGTTGATACTTCTAACCCTAAGACACCCAAGTATAAATTTGTCCGTA 
TCCAACCTGGTCAAACATTTTCAGTTCTAGCATGCTACAATGGTTCACCA 
TCTGGTGTTTATCAGTGTGCCATGAGACCTAATCATACCATTAAAGGTTC 
TTTCCTTAATGGATCATGTGGTAGTGTTGGTTTTAACATTGATTATGATT 
GCGTGTCTTTCTGCTATATGCATCATATGGAGCTTCCAACAGGAGTACAC 
GCTGGTACTGACTTAGAAGGTAAATTCTATGGTCCATTTGTTGACAGACA 
AACTGCACAGGCTGCAGGTACAGACACAACCATAACATTAAATGTTTTGG 
CATGGCTGTATGCTGCTGTTATCAATGGTGATAGGTGGTTTCTTAATAGA 
TTCACCACTACTTTGAATGACTTTAACCTTGTGGCAATGAAGTACAACTA 
TGAACCTTTGACACAAGATCATGTTGACATATTGGGACCTCTTTCTGCTC 
AAACAGGAATTGCCGTCTTAGATATGTGTGCTGCTTTGAAAGAGCTGCTG 
CAGAATGGTATGAATGGTCGTACTATCCTTGGTAGCACTATTTTAGAAGA 
TGAGTTTACACCATTTGATGTTGTTAGACAATGCTCTGGTGTTACCTTCC 
AAGGTAAGTTCAAGAAAATTGTTAAGGGCACTCATCATTGGATGCTTTTA 
ACTTTCTTGACATCACTATTGATTCTTGTTCAAAGTACACAGTGGTCACT 
GTTTTTCTTTGTTTACGAGAATGCTTTCTTGCCATTTACTCTTGGTATTA 
TGGCAATTGCTGCATGTGCTATGCTGCTTGTTAAGCATAAGCACGCATTC 
TTGTGCTTGTTTCTGTTACCTTCTCTTGCAACAGTTGCTTACTTTAATAT 
GGTCTACATGCCTGCTAGCTGGGTGATGCGTATCATGACATGGCTTGAAT 
TGGCTGACACTAGCTTGTCTGGTTATAGGCTTAAGGATTGTGTTATGTAT 
GCTTCAGCTTTAGTTTTGCTTATTCTCATGACAGCTCGCACTGTTTATGA 
TGATGCTGCTAGACGTGTTTGGACACTGATGAATGTCATTACACTTGTTT 
ACAAAGTCTACTATGGTAATGCTTTAGATCAAGCTATTTCCATGTGGGCC
TTAGTTATTTCTGTAACCTCTAACTATTCTGGTGTCGTTACGACTATCAT
GTTTTTAGCTAGAGCTATAGTGTTTGTGTGTGTTGAGTATTACCCATTGT 
TATTTATTACTGGCAACACCTTACAGTGTATCATGCTTGTTTATTGTTTC 
TTAGGCTATTGTTGCTGCTGCTACTTTGGCCTTTTCTGTTTACTCAACCG 
TTACTTCAGGCTTACTCTTGGTGTTTATGACTACTTGGTCTCTACACAAG 
AATTTAGGTATATGAACTCCCAGGGGCTTTTGCCTCCTAAGAGTAGTATT 
GATGCTTTCAAGCTTAACATTAAGTTGTTGGGTATTGGAGGTAAACCATG 
TATCAAGGTTGCTACTGTACAGTCTAAAATGTCTGACGTAAAGTGCACAT 
CTGTGGTACTGCTCTCGGTTCTTCAACAACTTAGAGTAGAGTCATCTTCT 
AAATTGTGGGCACAATGTGTACAACTCCACAATGATATTCTTCTTGCAAA
AGACACAACTGAAGCTTTCGAGAAGATGGTTTCTCTTTTGTCTGTTTTGC 
TATCCATGCAGGGTGCTGTAGACATTAATAGGTTGTGCGAGGAAATGCTC 
GATAACCGTGCTACTCTTCAGGCTATTGCTTCAGAATTTAGTTCTTTACC 
ATCATATGCCGCTTATGCCACTGCCCAGGAGGCCTATGAGCAGGCTGTAG 
CTAATGGTGATTCTGAAGTCGTTCTCAAAAAGTTAAAGAAATCTTTGAAT 
GTGGCTAAATCTGAGTTTGACCGTGATGCTGCCATGCAACGCAAGTTGGA 
AAAGATGGCAGATCAGGCTATGACCCAAATGTACAAACAGGCAAGATCTG 
AGGACAAGAGGGCAAAAGTAACTAGTGCTATGCAAACAATGCTCTTCACT 
ATGCTTAGGAAGCTTGATAATGATGCACTTAACAACATTATCAACAATGC 
GCGTGATGGTTGTGTTCCACTCAACATCATACCATTGACTACAGCAGCCA 
AACTCATGGTTGTTGTCCCTGATTATGGTACCTACAAGAACACTTGTGAT 
GGTAACACCTTTACATATGCATCTGCACTCTGGGAAATCCAGCAAGTTGT 
TGATGCGGATAGCAAGATTGTTCAACTTAGTGAAATTAACATGGACAATT 
CACCAAATTTGGCTTGGCCTCTTATTGTTACAGCTCTAAGAGCCAACTCA 
GCTGTTAAACTACAGAATAATGAACTGAGTCCAGTAGCACTACGACAGAT 
GTCCTGTGCGGCTGGTACCACACAAACAGCTTGTACTGATGACAATGCAC 
TTGCCTACTATAACAATTCGAAGGGAGGTAGGTTTGTGCTGGCATTACTA 
TCAGACCACCAAGATCTCAAATGGGCTAGATTCCCTAAGAGTGATGGTAC 
AGGTACAATTTACACAGAACTGGAACCACCTTGTAGGTTTGTTACAGACA 
CACCAAAAGGGCCTAAAGTGAAATACTTGTACTTCATCAAAGGCTTAAAC 
AACCTAAATAGAGGTATGGTGCTGGGCAGTTTAGCTGCTACAGTACGTCT
TCAGGCTGGAAATGCTACAGAAGTACCTGCCAATTCAACTGTGCTTTCCT 
TCTGTGCTTTTGCAGTAGACCCTGCTAAAGCATATAAGGATTACCTAGCA 
AGTGGAGGACAACCAATCACCAACTGTGTGAAGATGTTGTGTACACACAC 
TGGTACAGGACAGGCAATTACTGTAACACCAGAAGCTAACATGGACCAAG 
AGTCCTTTGGTGGTGCTTCATGTTGTCTGTATTGTAGATGCCACATTGAC 
CATCCAAATCCTAAAGGATTCTGTGACTTGAAAGGTAAGTACGTCCAAAT 
ACCTACCACTTGTGCTAATGACCCAGTGGGTTTTACACTTAGAAACACAG 
TCTGTACCGTCTGCGGAATGTGGAAAGGTTATGGCTGTAGTTGTGACCAA 
CTCCGCGAACCCTTGATGCAGTCTGCGGATGCATCAACGTTTTTAAACGG 
GTTTGCGGTGTAAGTGCAGCCCGTCTTACACCGTGCGGCACAGGCACTAG 
TACTGATGTCGTCTACAGGGCTTTTGATATTTACAACGAAAAAGTTGCTG 
GTTTTGCAAAGTTCCTAAAAACTAATTGCTGTCGCTTCCAGGAGAAGGAT 
GAGGAAGGCAATTTATTAGACTCTTACTTTGTAGTTAAGAGGCATACTAT 
GTCTAACTACCAACATGAAGAGACTATTTATAACTTGGTTAAAGATTGTC 
CAGCGGTTGCTGTCCATGACTTTTTCAAGTTTAGAGTAGATGGTGACATG 
GTACCACATATATCACGTCAGCGTCTAACTAAATACACAATGGCTGATTT 
AGTCTATGCTCTACGTCATTTTGATGAGGGTAATTGTGATACATTAAAAG 
AAATACTCGTCACATACAATTGCTGTGATGATGATTATTTCAATAAGAAG 
GATTGGTATGACTTCGTAGAGAATCCTGACATCTTACGCGTATATGCTAA 
CTTAGGTGAGCGTGTACGCCAATCATTATTAAAGACTGTACAATTCTGCG 
ATGCTATGCGTGATGCAGGCATTGTAGGCGTACTGACATTAGATAATCAG 
GATCTTAATGGGAACTGGTACGATTTCGGTGATTTCGTACAAGTAGCACC 
AGGCTGCGGAGTTCCTATTGTGGATTCATATTACTCATTGCTGATGCCCA 
TCCTCACTTTGACTAGGGCATTGGCTGCTGAGTCCCATATGGATGCTGAT 
CTCGCAAAACCACTTATTAAGTGGGATTTGCTGAAATATGATTTTACGGA 
AGAGAGACTTTGTCTCTTCGACCGTTATTTTAAATATTGGGACCAGACAT 
ACCATCCCAATTGTATTAACTGTTTGGATGATAGGTGTATCCTTCATTGT
GCAAACTTTAATGTGTTATTTTCTACTGTGTTTCCACCTACAAGTTTTGG 
ACCACTAGTAAGAAAAATATTTGTAGATGGTGTTCCTTTTGTTGTTTCAA 
CTGGATACCATTTTCGTGAGTTAGGAGTCGTACATAATCAGGATGTAAAC 
TTACATAGCTCGCGTCTCAGTTTCAAGGAACTTTTAGTGTATGCTGCTGA 
TCCAGCTATGCATGCAGCTTCTGGCAATTTATTGCTAGATAAACGCACTA 
CATGCTTTTCAGTAGCTGCACTAACAAACAATGTTGCTTTTCAAACTGTC 
AAACCCGGTAATTTTAATAAAGACTTTTATGACTTTGCTGTGTCTAAAGG 
TTTCTTTAAGGAAGGAAGTTCTGTTGAACTAAAACACTTCTTCTTTGCTC
AGGATGGCAACGCTGCTATCAGTGATTATGACTATTATCGTTATAATCTG 
CCAACAATGTGTGATATCAGACAACTCCTATTCGTAGTTGAAGTTGTTGA 
TAAATACTTTGATTGTTACGATGGTGGCTGTATTAATGCCAACCAAGTAA 
TCGTTAACAATCTGGATAAATCAGCTGGTTTCCCATTTAATAAATGGGGT 
AAGGCTAGACTTTATTATGACTCAATGAGTTATGAGGATCAAGATGCACT 
TTTCGCGTATACTAAGCGTAATGTCATCCCTACTATAACTCAAATGAATC 
TTAAGTATGCCATTAGTGCAAAGAATAGAGCTCGCACCGTAGCTGGTGTC 
TCTATCTGTAGTACTATGACAAATAGACAGTTTCATCAGAAATTATTGAA 
GTCAATAGCCGCCACTAGAGGAGCTACTGTGGTAATTGGAACAAGCAAGT 
TTTACGGTGGCTGGCATAATATGTTAAAAACTGTTTACAGTGATGTAGAA 
ACTCCACACCTTATGGGTTGGGATTATCCAAAATGTGACAGAGCCATGCC 
TAACATGCTTAGGATAATGGCCTCTCTTGTTCTTGCTCGCAAACATAACA 
CTTGCTGTAACTTATCACACCGTTTCTACAGGTTAGCTAACGAGTGTGCG 
CAAGTATTAAGTGAGATGGTCATGTGTGGCGGCTCACTATATGTTAAACC 
AGGTGGAACATCATCCGGTGATGCTACAACTGCTTATGCTAATAGTGTCT 
TTAACATTTGTCAAGCTGTTACAGCCAATGTAAATGCACTTCTTTCAACT 
GATGGTAATAAGATAGCTGACAAGTATGTCCGCAATCTACAACACAGGCT 
CTATGAGTGTCTCTATAGAAATAGGGATGTTGATCATGAATTCGTGGATG 
AGTTTTACGCTTACCTGCGTAAACATTTCTCCATGATGATTCTTTCTGAT 
GATGCCGTTGTGTGCTATAACAGTAACTATGCGGCTCAAGGTTTAGTAGC 
TAGCATTAAGAACTTTAAGGCAGTTCTTTATTATCAAAATAATGTGTTCA 
TGTCTGAGGCAAAATGTTGGACTGAGACTGACCTTACTAAAGGACCTCAC 
GAATTTTGCTCACAGCATACAATGCTAGTTAAACAAGGAGATGATTACGT 
GTACCTGCCTTACCCAGATCCATCAAGAATATTAGGCGCAGGCTGTTTTG 
TCGATGATATTGTCAAAACAGATGGTACACTTATGATTGAAAGGTTCGTG 
TCACTGGCTATTGATGCTTACCCACTTACAAAACATCCTAATCAGGAGTA 
TGCTGATGTCTTTCACTTGTATTTACAATACATTAGAAAGTTACATGATG 
AGCTTACTGGCCACATGTTGGACATGTATTCCGTAATGCTAACTAATGAT 
AACACCTCACGGTACTGGGAACCTGAGTTTTATGAGGCTATGTACACACC 
ACATACAGTCTTGCAGGCTGTAGGTGCTTGTGTATTGTGCAATTCACAGA 
CTTCACTTCGTTGCGGTGCCTGTATTAGGAGACCATTCCTATGTTGCAAG 
TGCTGCTATGACCATGTCATTTCAACATCACACAAATTAGTGTTGTCTGT 
TAATCCCTATGTTTGCAATGCCCCAGGTTGTGATGTCACTGATGTGACAC 
AACTGTATCTAGGAGGTATGAGCTATTATTGCAAGTCACATAAGCCTCCC 
ATTAGTTTTCCATTATGTGCTAATGGTCAGGTTTTTGGTTTATACAAAAA 
CACATGTGTAGGCAGTGACAATGTCACTGACTTCAATGCGATAGCAACAT 
GTGATTGGACTAATGCTGGCGATTACATACTTGCCAACACTTGTACTGAG 
AGACTCAAGCTTTTCGCAGCAGAAACGCTCAAAGCCACTGAGGAAACATT
TAAGCTGTCATATGGTATTGCCACTGTACGCGAAGTACTCTCTGACAGAG 
AATTGCATCTTTCATGGGAGGTTGGAAAACCTAGACCACCATTGAACAGA 
AACTATGTCTTTACTGGTTACCGTGTAACTAAAAATAGTAAAGTACAGAT 
TGGAGAGTACACCTTTGAAAAAGGTGACTATGGTGATGCTGTTGTGTACA 
GAGGTACTACGACATACAAGTTGAATGTTGGTGATTACTTTGTGTTGACA 
TCTCACACTGTAATGCCACTTAGTGCACCTACTCTAGTGCCACAAGAGCA 
CTATGTGAGAATTACTGGCTTGTACCCAACACTCAACATCTCAGATGAGT 
TTTCTAGCAATGTTGCAAATTATCAAAAGGTCGGCATGCAAAAGTACTCT
ACACTCCAAGGACCACCTGGTACTGGTAAGAGTCATTTTGCCATCGGACT 
TGCTCTCTATTACCCATCTGCTCGCATAGTGTATACGGCATGCTCTCATG



FIGURE 11F




CAGCTGTTGATGCCCTATGTGAAAAGGCATTAAAATATTTGCCCATAGAT 
AAATGTAGTAGAATCATACCTGCGCGTGCGCGCGTAGAGTGTTTTGATAA
ATTCAAAGTGAATTCAACACTAGAACAGTATGTTTTCTGCACTGTAAATG 
CATTGCCAGAAACAACTGCTGACATTGTAGTCTTTGATGAAATCTCTATG 
GCTACTAATTATGACTTGAGTGTTGTCAATGCTAGACTTCGTGCAAAACA 
CTACGTCTATATTGGCGATCCTGCTCAATTACCAGCCCCCCGCACATTGC 
TGACTAAAGGCACACTAGAACCAGAATATTTTAATTCAGTGTGCAGACTT 
ATGAAAACAATAGGTCCAGACATGTTCCTTGGAACTTGTCGCCGTTGTCC 
TGCTGAAATTGTTGACACTGTGAGTGCTTTAGTTTATGACAATAAGCTAA 
AAGCACACAAGGATAAGTCAGCTCAATGCTTCAAAATGTTCTACAAAGGT 
GTTATTACACATGATGTTTCATCTGCAATCAACAGACCTCAAATAGGCGT 
TGTAAGAGAATTTCTTACACGCAATCCTGCTTGGAGAAAAGCTGTTTTTA 
TCTCACCTTATAATTCACAGAACGCTGTAGCTTCAAAAATCTTAGGATTG 
CCTACGCAGACTGTTGATTCATCACAGGGTTCTGAATATGACTATGTCAT 
ATTCACACAAACTACTGAAACAGCACACTCTTGTAATGTCAACCGCTTCA 
ATGTGGCTATCACAAGGGCAAAAATTGGCATTTTGTGCATAATGTCTGAT 
AGAGATCTTTATGACAAACTGCAATTTACAAGTCTAGAAATACCACGTCG 
CAATGTGGCTACATTACAAGCAGAAAATGTAACTGGACTTTTTAAGGACT 
GTAGTAAGATCATTACTGGTCTTCATCCTACACAGGCACCTACACACCTC 
AGCGTTGATATAAAGTTCAAGACTGAAGGATTATGTGTTGACATACCAGG 
CATACCAAAGGACATGACCTACCGTAGACTCATCTCTATGATGGGTTTCA 
AAATGAATTACCAAGTCAATGGTTACCCTAATATGTTTATCACCCGCGAA 
GAAGCTATTCGTCACGTTCGTGCGTGGATTGGCTTTGATGTAGAGGGCTG 
TCATGCAACTAGAGATGCTGTGGGTACTAACCTACCTCTCCAGCTAGGAT 
TTTCTACAGGTGTTAACTTAGTAGCTGTACCGACTGGTTATGTTGACACT 
GAAAATAACACAGAATTCACCAGAGTTAATGCAAAACCTCCACCAGGTGA 
CCAGTTTAAACATCTTATACCACTCATGTATAAAGGCTTGCCCTGGAATG 
TAGTGCGTATTAAGATAGTACAAATGCTCAGTGATACACTGAAAGGATTG 
TCAGACAGAGTCGTGTTCGTCCTTTGGGCGCATGGCTTTGAGCTTACATC 
AATGAAGTACTTTGTCAAGATTGGACCTGAAAGAACGTGTTGTCTGTGTG 
ACAAACGTGCAACTTGCTTTTCTACTTCATCAGATACTTATGCCTGCTGG 
AATCATTCTGTGGGTTTTGACTATGTCTATAACCCATTTATGATTGATGT 
TCAGCAGTGGGGCTTTACGGGTAACCTTCAGAGTAACCATGACCAACATT 
GCCAGGTACATGGAAATGCACATGTGGCTAGTTGTGATGCTATCATGACT 
AGATGTTTAGCAGTCCATGAGTGCTTTGTTAAGCGCGTTGATTGGTCTGT 
TGAATACCCTATTATAGGAGATGAACTGAGGGTTAATTCTGCTTGCAGAA 
AAGTACAACACATGGTTGTGAAGTCTGCATTGCTTGCTGATAAGTTTCCA 
GTTCTTCATGACATTGGAAATCCAAAGGCTATCAAGTGTGTGCCTCAGGC 
TGAAGTAGAATGGAAGTTCTACGATGCTCAGCCATGTAGTGACAAAGCTT 
ACAAAATAGAGGAACTCTTCTATTCTTATGCTACACATCACGATAAATTC 
ACTGATGGTGTTTGTTTGTTTTGGAATTGTAACGTTGATCGTTACCCAGC 
CAATGCAATTGTGTGTAGGTTTGACACAAGAGTCTTGTCAAACTTGAACT 
TACCAGGCTGTGATGGTGGTAGTTTGTATGTGAATAAGCATGCATTCCAC 
ACTCCAGCTTTCGATAAAAGTGCATTTACTAATTTAAAGCAATTGCCTTT 
CTTTTACTATTCTGATAGTCCTTGTGAGTCTCATGGCAAACAAGTAGTGT 
CGGATATTGATTATGTTCCACTCAAATCTGCTACGTGTATTACACGATGC 
AATTTAGGTGGTGCTGTTTGCAGACACCATGCAAATGAGTACCGACAGTA 
CTTGGATGCATATAATATGATGATTTCTGCTGGATTTAGCCTATGGATTT 
ACAAACAATTTGATACTTATAACCTGTGGAATACATTTACCAGGTTACAG 
AGTTTAGAAAATGTGGCTTATAATGTTGTTAATAAAGGACACTTTGATGG 
ACACGCCGGCGAAGCACCTGTTTCCATCATTAATAATGCTGTTTACACAA 
AGGTAGATGGTATTGATGTGGAGATCTTTGAAAATAAGACAACACTTCCT 
GTTAATGTTGCATTTGAGCTTTGGGCTAAGCGTAACATTAAACCAGTGCC 
AGAGATTAAGATACTCAATAATTTGGGTGTTGATATCGCTGCTAATACTG 
TAATCTGGGACTACAAAAGAGAAGCCCCAGCACATGTATCTACAATAGGT 
GTCTGCACAATGACTGACATTGCCAAGAAACCTACTGAGAGTGCTTGTTC 
TTCACTTACTGTCTTGTTTGATGGTAGAGTGGAAGGACAGGTAGACCTTT 
TTAGAAACGCCCGTAATGGTGTTTTAATAACAGAAGGTTCAGTCAAAGGT
CTAACACCTTCAAAGGGACCAGCACAAGCTAGCGTCAATGGAGTCACATT 
AATTGGAGAATCAGTAAAAACACAGTTTAACTACTTTAAGAAAGTAGACG 
GCATTATTCAACAGTTGCCTGAAACCTACTTTACTCAGAGCAGAGACTTA 
GAGGATTTTAAGCCCAGATCACAAATGGAAACTGACTTTCTCGAGCTCGC 
TATGGATGAATTCATACAGCGATATAAGCTCGAGGGCTATGCCTTCGAAC 
ACATCGTTTATGGAGATTTCAGTCATGGACAACTTGGCGGTCTTCATTTA 
ATGATAGGCTTAGCCAAGCGCTCACAAGATTCACCACTTAAATTAGAGGA 
TTTTATCCCTATGGACAGCACAGTGAAAAATTACTTCATAACAGATGCGC 
AAACAGGTTCATCAAAATGTGTGTGTTCTGTGATTGATCTTTTACTTGAT 
GACTTTGTCGAGATAATAAAGTCACAAGATTTGTCAGTGATTTCAAAAGT 
GGTCAAGGTTACAATTGACTATGCTGAAATTTCATTCATGCTTTGGTGTA 
AGGATGGACATGTTGAAACCTTCTACCCAAAACTACAAGCAAGTCAAGCG 
TGGCAACCAGGTGTTGCGATGCCTAACTTGTACAAGATGCAAAGAATGCT 
TCTTGAAAAGTGTGACCTTCAGAATTATGGTGAAAATGCTGTTATACCAA 
AAGGAATAATGATGAATGTCGCAAAGTATACTCAACTGTGTCAATACTTA 
AATACACTTACTTTAGCTGTACCCTACAACATGAGAGTTATTCACTTTGG 
TGCTGGCTCTGATAAAGGAGTTGCACCAGGTACAGCTGTGCTCAGACAAT 
GGTTGCCAACTGGCACACTACTTGTCGATTCAGATCTTAATGACTTCGTC 
TCCGACGCAGATTCTACTTTAATTGGAGACTGTGCAACAGTACATACGGC 
TAATAAATGGGACCTTATTATTAGCGATATGTATGACCCTAGGACCAAAC 
ATGTGACAAAAGAGAATGACTCTAAAGAAGGGTTTTTCACTTATCTGTGT 
GGATTTATAAAGCAAAAACTAGCCCTGGGTGGTTCTATAGCTGTAAAGAT 
AACAGAGCATTCTTGGAATGCTGACCTTTACAAGCTTATGGGCCATTTCT 
CATGGTGGACAGCTTTTGTTACAAATGTAAATGCATCATCATCGGAAGCA 
TTTTTAATTGGGGCTAACTATCTTGGCAAGCCGAAGGAACAAATTGATGG 
CTATACCATGCATGCTAACTACATTTTCTGGAGGAACACAAATCCTATCC 
AGTTGTCTTCCTATTCACTCTTTGACATGAGCAAATTTCCTCTTAAATTA 
AGAGGAACTGCTGTAATGTCTCTTAAGGAGAATCAAATCAATGATATGAT 
TTATTCTCTTCTGGAAAAAGGTAGGCTTATCATTAGAGAAAACAACAGAG 
TTGTGGTTTCAAGTGATATTCTTGTTAACAACTAAACGAACATGTTTATT 
TTCTTATTATTTCTTACTCTCACTAGTGGTAGTGACCTTGACCGGTGCAC 
CACTTTTGATGATGTTCAAGCTCCTAATTACACTCAACATACTTCATCTA 
TGAGGGGGGTTTACTATCCTGATGAAATTTTTAGATCAGACACTCTTTAT 
TTAACTCAGGATTTATTTCTTCCATTTTATTCTAATGTTACAGGGTTTCA 
TACTATTAATCATACGTTTGGCAACCCTGTCATACCTTTTAAGGATGGTA 
TTTATTTTGCTGCCACAGAGAAATCAAATGTTGTCCGTGGTTGGGTTTTT 
GGTTCTACCATGAACAACAAGTCACAGTCGGTGATTATTATTAACAATTC 
TACTAATGTTGTTATACGAGCATGTAACTTTGAATTGTGTGACAACCCTT 
TCTTTGCTGTTTCTAAACCCATGGGTACACAGACACATACTATGATATTC 
GATAATGCATTTAATTGCACTTTCGAGTACATATCTGATGCCTTTTCGCT 
TGATGTTTCAGAAAAGTCAGGTAATTTTAAACACTTACGAGAGTTTGTGT 
TTAAAAATAAAGATGGGTTTCTCTATGTTTATAAGGGCTATCAACCTATA 
GATGTAGTTCGTGATCTACCTTCTGGTTTTAACACTTTGAAACCTATTTT 
TAAGTTGCCTCTTGGTATTAACATTACAAATTTTAGAGCCATTCTTACAG 
CCTTTTCACCTGCTCAAGACATTTGGGGCACGTCAGCTGCAGCCTATTTT 
GTTGGCTATTTAAAGCCAACTACATTTATGCTCAAGTATGATGAAAATGG 
TACAATCACAGATGCTGTTGATTGTTCTCAAAATCCACTTGCTGAACTCA 
AATGCTCTGTTAAGAGCTTTGAGATTGACAAAGGAATTTACCAGACCTCT 
AATTTCAGGGTTGTTCCCTCAGGAGATGTTGTGAGATTCCCTAATATTAC 
AAACTTGTGTCCTTTTGGAGAGGTTTTTAATGCTACTAAATTCCCTTCTG 
TCTATGCATGGGAGAGAAAAAAAATTTCTAATTGTGTTGCTGATTACTCT 
GTGCTCTACAACTCAACATTTTTTTCAACCTTTAAGTGCTATGGCGTTTC 
TGCCACTAAGTTGAATGATCTTTGCTTCTCCAATGTCTATGCAGATTCTT 
TTGTAGTCAAGGGAGATGATGTAAGACAAATAGCGCCAGGACAAACTGGT 
GTTATTGCTGATTATAATTATAAATTGCCAGATGATTTCATGGGTTGTGT 
CCTTGCTTGGAATACTAGGAACATTGATGCTACTTCAACTGGTAATTATA 
ATTATAAATATAGGTATCTTAGACATGGCAAGCTTAGGCCCTTTGAGAGA
GACATATCTAATGTGCCTTTCTCCCCTGATGGCAAACCTTGCACCCCACC 
TGCTCTTAATTGTTATTGGCCATTAAATGATTATGGTTTTTACACCACTA 
CTGGCATTGGCTACCAACCTTACAGAGTTGTAGTACTTTCTTTTGAACTT 
TTAAATGCACCGGCCACGGTTTGTGGACCAAAATTATCCACTGACCTTAT 
TAAGAACCAGTGTGTCAATTTTAATTTTAATGGACTCACTGGTACTGGTG 
TGTTAACTCCTTCTTCAAAGAGATTTCAACCATTTCAACAATTTGGCCGT 
GATGTTTCTGATTTCACTGATTCCGTTCGAGATCCTAAAACATCTGAAAT 
ATTAGACATTTCACCTTGCGCTTTTGGGGGTGTAAGTGTAATTACACCTG 
GAACAAATGCTTCATCTGAAGTTGCTGTTCTATATCAAGATGTTAACTGC 
ACTGATGTTTCTACAGCAATTCATGCAGATCAACTCACACCAGCTTGGCG 
CATATATTCTACTGGAAACAATGTATTCCAGACTCAAGCAGGCTGTCTTA 
TAGGAGCTGAGCATGTCGACACTTCTTATGAGTGCGACATTCCTATTGGA 
GCTGGCATTTGTGCTAGTTACCATACAGTTTCTTTATTACGTAGTACTAG 
CCAAAAATCTATTGTGGCTTATACTATGTCTTTAGGTGCTGATAGTTCAA 
TTGCTTACTCTAATAACACCATTGCTATACCTACTAACTTTTCAATTAGG 
ATTACTACAGAAGTAATGCCTGTTTCTATGGCTAAAACCTCCGTAGATTG 
TAATATGTACATCTGCGGAGATTCTACTGAATGTGCTAATTTGCTTCTCC 
AATATGGTAGCTTTTGCACACAACTAAATCGTGCACTCTCAGGTATTGCT 
GCTGAACAGGATCGCAACACACGTGAAGTGTTCGCTCAAGTCAAACAAAT 
GTACAAAACCCCAACTTTGAAATATTTTGGTGGTTTTAATTTTTCACAAA 
TATTACCTGACCCTCTAAAGCCAACTAAGAGGTCTTTTATTGAGGACTTG 
CTCTTTAATAAGGTGACACTCGCTGATGCTGGCTTCATGAAGCAATATGG 
CGAATGCCTAGGTGATATTAATGCTAGAGATCTCATTTGTGCGCAGAAGT 
TCAATGGACTTACAGTGTTGCCACCTCTGCTCACTGATGATATGATTGCT 
GCCTACACTGCTGCTCTAGTTAGTGGTACTGCCACTGCTGGATGGACATT 
TGGTGCTGGCGCTGCTCTTCAAATACCTTTTGCTATGCAAATGGCATATA 
GGTTCAATGGCATTGGAGTTACCCAAAATGTTCTCTATGAGAACCAAAAA 
CAAATCGCCAACCAATTTAACAAGGCGATTAGTCAAATTCAAGAATCACT 
TACAACAACATCAACTGCATTGGGCAAGCTGCAAGACGTTGTTAACCAGA 
ATGCTCAAGCATTAAACACACTTGTTAAACAACTTAGCTCTAATTTTGGT 
GCAATTTCAAGTGTGCTAAATGATATCCTTTCGCGACTTGATAAAGTCGA 
GGCGGAGGTACAAATTGACAGGTTAATTACAGGCAGACTTCAAAGCCTTC 
AAACCTATGTAACACAACAACTAATCAGGGCTGCTGAAATCAGGGCTTCT 
GCTAATCTTGCTGCTACTAAAATGTCTGAGTGTGTTCTTGGACAATCAAA 
AAGAGTTGACTTTTGTGGAAAGGGCTACCACCTTATGTCCTTCCCACAAG 
CAGCCCCGCATGGTGTTGTCTTCCTACATGTCACGTATGTGCCATCCCAG 
GAGAGGAACTTCACCACAGCGCCAGCAATTTGTCATGAAGGCAAAGCATA 
CTTCCCTCGTGAAGGTGTTTTTGTGTTTAATGGCACTTCTTGGTTTATTA 
CACAGAGGAACTTCTTTTCTCCACAAATAATTACTACAGACAATACATTT 
GTCTCAGGAAATTGTGATGTCGTTATTGGCATCATTAACAACACAGTTTA 
TGATCCTCTGCAACCTGAGCTTGACTCATTCAAAGAAGAGCTGGACAAGT 
ACTTCAAAAATCATACATCACCAGATGTTGATCTTGGCGACATTTCAGGC 
ATTAACGCTTCTGTCGTCAACATTCAAAAAGAAATTGACCGCCTCAATGA 
GGTCGCTAAAAATTTAAATGAATCACTCATTGACCTTCAAGAATTGGGAA 
AATATGAGCAATATATTAAATGGCCTTGGTATGTTTGGCTCGGCTTCATT 
GCTGGACTAATTGCCATCGTCATGGTTACAATCTTGCTTTGTTGCATGAC 
TAGTTGTTGCAGTTGCCTCAAGGGTGCATGCTCTTGTGGTTCTTGCTGCA 
AGTTTGATGAGGATGACTCTGAGCCAGTTCTCAAGGGTGTCAAATTACAT 
TACACATAAACGAACTTATGGATTTGTTTATGAGATTTTTTACTCTTAGA 
TCAATTACTGCACAGCCAGTAAAAATTGACAATGCTTCTCCTGCAAGTAC 
TGTTCATGCTACAGCAACGATACCGCTACAAGCCTCACTCCCTTTCGGAT 
GGCTTGTTATTGGCGTTGCATTTCTTGCTGTTTTTCAGAGCGCTACCAAA 
ATAATTGCGCTCAATAAAAGATGGCAGCTAGCCCTTTATAAGGGCTTCCA 
GTTCATTTGCAATTTACTGCTGCTATTTGTTACCATCTATTCACATCTTT 
TGCTTGTCGCTGCAGGTATGGAGGCGCAATTTTTGTACCTCTATGCCTTG 
ATATATTTTCTACAATGCATCAACGCATGTAGAATTATTATGAGATGTTG 
GCTTTGTTGGAAGTGCAAATCCAAGAACCCATTACTTTATGATGCCAACT
ACTTTGTTTGCTGGCACACACATAACTATGACTACTGTATACCATATAAC 
AGTGTCACAGATACAATTGTCGTTACTGAAGGTGACGGCATTTCAACACC 
AAAACTCAAAGAAGACTACCAAATTGGTGGTTATTCTGAGGATAGGCACT 
CAGGTGTTAAAGACTATGTCGTTGTACATGGCTATTTCACCGAAGTTTAC 
TACCAGCTTGAGTCTACACAAATTACTACAGACACTGGTATTGAAAATGC 
TACATTCTTCATCTTTAACAAGCTTGTTAAAGACCCACCGAATGTGCAAA 
TACACACAATCGACGGCTCTTCAGGAGTTGCTAATCCAGCAATGGATCCA 
ATTTATGATGAGCCGACGACGACTACTAGCGTGCCTTTGTAAGCACAAGA 
AAGTGAGTACGAACTTATGTACTCATTCGTTTCGGAAGAAACAGGTACGT 
TAATAGTTAATAGCGTACTTCTTTTTCTTGCTTTCGTGGTATTCTTGCTA 
GTCACACTAGCCATCCTTACTGCGCTTCGATTGTGTGCGTACTGCTGCAA 
TATTGTTAACGTGAGTTTAGTAAAACCAACGGTTTACGTCTACTCGCGTG 
TTAAAAATCTGAACTCTTCTGAAGGAGTTCCTGATCTTCTGGTCTAAACG 
AACTAACTATTATTATTATTCTGTTTGGAACTTTAACATTGCTTATCATG 
GCAGACAACGGTACTATTACCGTTGAGGAGCTTAAACAACTCCTGGAACA 
ATGGAACCTAGTAATAGGTTTCCTATTCCTAGCCTGGATTATGTTACTAC 
AATTTGCCTATTCTAATCGGAACAGGTTTTTGTACATAATAAAGCTTGTT 
TTCCTCTGGCTCTTGTGGCCAGTAACACTTGCTTGTTTTGTGCTTGCTGC 
TGTCTACAGAATTAATTGGGTGACTGGCGGGATTGCGATTGCAATGGCTT 
GTATTGTAGGCTTGATGTGGCTTAGCTACTTCGTTGCTTCCTTCAGGCTG 
TTTGCTCGTACCCGCTCAATGTGGTCATTCAACCCAGAAACAAACATTCT 
TCTCAATGTGCCTCTCCGGGGGACAATTGTGACCAGACCGCTCATGGAAA 
GTGAACTTGTCATTGGTGCTGTGATCATTCGTGGTCACTTGCGAATGGCC 
GGACACTCCCTAGGGCGCTGTGACATTAAGGACCTGCCAAAAGAGATCAC 
TGTGGCTACATCACGAACGCTTTCTTATTACAAATTAGGAGCGTCGCAGC 
GTGTAGGCACTGATTCAGGTTTTGCTGCATACAACCGCTACCGTATTGGA 
AACTATAAATTAAATACAGACCACGCCGGTAGCAACGACAATATTGCTTT 
GCTAGTACAGTAAGTGACAACAGATGTTTCATCTTGTTGACTTCCAGGTT 
ACAATAGCAGAGATATTGATTATCATTATGAGGACTTTCAGGATTGCTAT 
TTGGAATCTTGACGTTATAATAAGTTCAATAGTGAGACAATTATTTAAGC 
CTCTAACTAAGAAGAATTATTCGGAGTTAGATGATGAAGAACCTATGGAG 
TTAGATTATCCATAAAACGAACATGAAAATTATTCTCTTCCTGACATTGA
TTGTATTTACATCTTGCGAGCTATATCACTATCAGGAGTGTGTTAGAGGT 
ACGACTGTACTACTAAAAGAACCTTGCCCATCAGGAACATACGAGGGCAA 
TTCACCATTTCACCCTCTTGCTGACAATAAATTTGCACTAACTTGCACTA 
GCACACACTTTGCTTTTGCTTGTGCTGACGGTACTCGACATACCTATCAG 
CTGCGTGCAAGATCAGTTTCACCAAAACTTTTCATCAGACAAGAGGAGGT 
TCAACAAGAGCTCTACTCGCCACTTTTTCTCATTGTTGCTGCTCTAGTAT 
TTTTAATACTTTGCTTCACCATTAAGAGAAAGACAGAATGAATGAGCTCA 
CTTTAATTGACTTCTATTTGTGCTTTTTAGCCTTTCTGCTATTCCTTGTT 
TTAATAATGCTTATTATATTTTGGTTTTCACTCGAAATCCAGGATCTAGA 
AGAACCTTGTACCAAAGTCTAAACGAACATGAAACTTCTCATTGTTTTGA 
CTTGTATTTCTCTATGCAGTTGCATATGCACTGTAGTACAGCGCTGTGCA 
TCTAATAAACCTCATGTGCTTGAAGATCCTTGTAAGGTACAACACTAGGG 
GTAATACTTATAGCACTGCTTGGCTTTGTGCTCTAGGAAAGGTTTTACCT 
TTTCATAGATGGCACACTATGGTTCAAACATGCACACCTAATGTTACTAT 
CAACTGTCAAGATCCAGCTGGTGGTGCGCTTATAGCTAGGTGTTGGTACC 
TTCATGAAGGTCACCAAACTGCTGCATTTAGAGACGTACTTGTTGTTTTA 
AATAAACGAACAAATTAAAATGTCTGATAATGGACCCCAATCAAACCAAC 
GTAGTGCCCCCCGCATTACATTTGGTGGACCCACAGATTCAACTGACAAT 
AACCAGAATGGAGGACGCAATGGGGCAAGGCCAAAACAGCGCCGACCCCA 
AGGTTTACCCAATAATACTGCGTCTTGGTTCACAGCTCTCACTCAGCATG 
GCAAGGAGGAACTTAGATTCCCTCGAGGCCAGGGCGTTCCAATCAACACC 
AATAGTGGTCCAGATGACCAAATTGGCTACTACCGAAGAGCTACCCGACG 
AGTTCGTGGTGGTGACGGCAAAATGAAAGAGCTCAGCCCCAGATGGTACT 
TCTATTACCTAGGAACTGGCCCAGAAGCTTCACTTCCCTACGGCGCTAAC 



FIGURE 11J




AAAGAAGGCATCGTATGGGTTGCAACTGAGGGAGCCTTGAATACACCCAA 
AGACCACATTGGCACCCGCAATCCTAATAACAATGCTGCCACCGTGCTAC 
AACTTCCTCAAGGAACAACATTGCCAAAAGGCTTCTACGCAGAGGGAAGC 
AGAGGCGGCAGTCAAGCCTCTTCTCGCTCCTCATCACGTAGTCGCGGTAA 
TTCAAGAAATTCAACTCCTGGCAGCAGTAGGGGAAATTCTCCTGCTCGAA 
TGGCTAGCGGAGGTGGTGAAACTGCCCTCGCGCTATTGCTGCTAGACAGA 
TTGAACCAGCTTGAGAGCAAAGTTTCTGGTAAAGGCCAACAACAACAAGG 
CCAAACTGTCACTAAGAAATCTGCTGCTGAGGCATCTAAAAAGCCTCGCC 
AAAAACGTACTGCCACAAAACAGTACAACGTCACTCAAGCATTTGGGAGA 
CGTGGTCCAGAACAAACCCAAGGAAATTTCGGGGACCAAGACGTAATCAG 
ACAAGGAACTGATTACAAACATTGGCCGCAAATTGCACAATTTGCTCCAA 
GTGCCTCTGCATTCTTTGGAATGTCACGCATTGGCATGGAAGTCACACCT 
TCGGGAACATGGCTGACTTATCATGGAGCCATTAAATTGGATGACAAAGA 
TCCACAATTCAAAGACAACGTCATACTGCTGAACAAGCACATTGACGCAT 
ACAAAACATTCCCACCAACAGAGCCTAAAAAGGACAAAAAGAAAAAGACT 
GATGAAGCTCAGCCTTTGCCGCAGAGACAAAAGAAGCAGCCCACTGTGAC 
TCTTCTTCCTGCGGCTGACATGGATGATTTCTCCAGACAACTTCAAAATT 
CCATGAGTGGAGCTTCTGCTGATTCAACTCAGGCATAAACACTCATGATG 
ACCACACAAGGCAGATGGGCTATGTAAACGTTTTCGCAATTCCGTTTACG 
ATACATAGTCTACTCTTGTGCAGAATGAATTCTCGTAACTAAACAGCACA 
AGTAGGTTTAGTTAACTTTAATCTCACATAGCAATCTTTAATCAATGTGT 
AACATTAGGGAGGACTTGAAAGAGCCACCACATTTTCATCGAGGCCACGC 
GGAGTACGATCGAGGGTACAGTGAATAATGCTAGGGAGAGCTGCCTATAT 
GGAAGAGCCCTAATGTGTAAAATTAATTTTAGTAGTGCTATCCCCATGTG 
ATTTTAATAGCTTCTTAGGAGAATGACAAAAAAAAAAAAAAAAAAAAAAA A 

GenBank Accession No. AY274119.3, SEQ ID NO: 15 
Membrane Glycoprotein
Nucleocapsid
S (Spike) Glycoprotein
MRSLIYFWLLLPVLPTLSLPQDVTRCQSTT-

MlVIilLCLLLFSYNSVICTSNNDCVQGNVTQLPGNE- 



-NFRRFFSKFNVQAPA 
■^NIIKDFLFHTFKEEP 



-MIFIILTLLSVAKSEDAPHGVTIiPQFNTSHNNERFELNFYNFLQTWDIPPNT 

MFLILLISLPMA 

, MFLILLISLPMA 

MFFILLISLPSA 

MLFVFILLLPSC 
VWLGGYLP

SWVGGYYPTE- 



— SMNSSSWYCGTGIETASGVHGIFLSYIDSGQGFE 
-VWYNCSRSATTTAYKDFSNIHAFYFDMEAMElSrSTG 



ETILGGYLPYCGAGVNCGWYNFSQSVGQNGKYAYXNTQNLNIPNVHGVYFDVREHNNDGE 

FAVIGDLKCT TVSINDVDTGAPSISTDIVDVTNGLG 

LAVIGDLKCT TVAINDVDTGVPSTSTDIVDVTNGLG 

FAVIGDLKCT TSLINDVDTGVPS I SSEWDVTNGLG 

LGYIGDFRC IQ TVNYNGNNAS AP S 1 STE AVDVSKGRG 

-MFIFLLFLTLTSGSDLDRCTTFDDVQAPNYTQ
IGISQEPFDPSGYQLYLHKATNGNTNATARLR--ICQFPDNKTLGPTVNDVTTG-

NARGKPLLVHVHGDPVS 1 1 1 YI S AYRDDVQPRPLLKHGLLCITKNKI IDYNTFTS AQWS - 

WDDRDKVGLLIAIHGNSKYSLLMVLQDAVEAWQPHVAVKICHIrtKPGNISSYHAPSVli^ 

TY YVLDRVYDOTTLLLNGYYPTSGSTYRNMALKGTLLLSRLWFKPPFLSDFING-- 

TY YVLDRVYLNTTLLLNGYYPTSGSTYRNMAIiKGTLLLSRLWFKPPFLSDFlNG- 

TF YVLDRVYLm'TLLLNGYYPISGATFRNMALKGTRLLSTLWFKPPFLSPF]^^ 

TY YVLDRVYLNATLLLTGYYPVDGSNYRNLALTGTNTLSLTWPKPPFLSEFNDG- 

HT SSMRGVYYPDEIFRSDTLYLTQDLFLPFYSNVTGFHTINHTFGNPVIPFKDG-
-RNCLFNKAIP AYMRDGKDIWGITWDNDRVT-VFADKIYHFYDKNDWSR

-AICLGDDRKIPFSVIPTDNGTKIFGLEWNDDYVTAYISDRSHHLNINNNVn^^ 

GGQCVFNQRFS LDTVLTTNDFYGFQWTDTYVDIYLGGTITKVlafVDNDWSIVEAS 

IFAKVKN TKVIKKGVMYSEFPAITIGSTFVNTSYSVWQPHTTN 

IFAKVKN TKVIKHGVMYSEFPAITIGSTFVNTSYSVWQPHTTN 

IFAKVKN SRFSKDGVIYSEFPAITIGSTFVNTSYSIWEPHTSL 

IFAKVQN LKTNTPTGATSYFPTIVIGSLFGNTSYTWLEPYNN 

IYFAATEK SNVVRGWVFGSTMNNKSQSVIIINNSTNVVIRACNFELCDN

MLGKSLFLVTILCALCSANLPDPANYVYYYQSAFRP 
MFVLLVAYALLHIAGCQTTNGLN-- TSYSVCNG CVGYSENVFAVES

- - VATRC YWRRSCAMQYVYTPTYYMLNVTSAGEDG- 1 YYEPCTAN- -CTG YAANVFATDS 

RSSSATWQKSAAYVYQGVSNFTYYKLNNTNGLKS YELCEDYEYCTGYATNVFAPTV 

MKKLPWLWMPLIYGDKFPTSWSN CTD - -QCASYVANVFTTQP 

-ISYHWNRINYGYYMQFVNRTTYYAYNNTGG2U)JYTQLQLSECHTD-YCAGYAKNVFVP-I 
-LDNKLQGLLEISVCQYTMCEYPHTICHPKL-GNKRVELWHWDTGWSCLYKRNFTYDVN 
-LDNKLQGLLEISVCQYTMCEYPNTICHPNL-GNRRVELV^tfDTGVVSCLYKRNra 
-INGNLQGLLQISVCQYTMCEYPHTICHPNL-GNQRIELWHYDTDWSCLYRRNFTYDVN 

IIMASVCTYTICQLPYTPCKPNTNGNRVIGFWHTDVKPPICLLKRNFTFNVN 

PFFAVSKPMGTQTHTMIFDNAFNCTFEYISDAFSLDVSEKSGNFKHLREFVFKNKDG 

SNGWHLQGGAYAWNSSNYAJ3NAGSASECT VGVIKDVYNQSAASIAMTAPLQG 
229E GGYIPSDFAFNN--WFLLTNTSSWDGWRSFQPLLLNCLWSVSGLRFTTGFVYFNGTGR

PEDV NGHIPEGFSFNN — WFLLSNDSTLLHGKWSNQPLLVNCLIjAXPKiyGLGQFFSFNHTMD 

CCOV GGYIPHGFSFNN — WFMRTNSSTFVSGRFVTNQPLLVNCLWPVPSFGVAAQQFCFEGAQF 

PRC GGFIPSDFSFNN — WFLLTNSSTLVSGKLVTKQPLLVNCLWPVPSFEEAASTFCFEGADF 

FICV DGKIPEDFSFSW — WFLLSDKSTLVQGRVLSSQPVFVQCLRPVPSWSNNTAWHFKN-D 

BoCov ADYLYFHFYQEGGTFYAYFTDTGWTKFLFNVYLGTVLSHYYVLPLTCS SAMTLEY 

OC43 ADYLYFHFYQEGGIFYAYFTDTGWTKFLFNVYLGTVLSYYYVMPLTCN SAMTLEY

PHEV ADYLYFHFYQEGGTFYAYFTDTGFVTKFLFKLYLGTVLSHYYVMPLTCN SALSLEY 

MHV APWLYFHFYQQGGTFYAYYADKPSATTFLFSVYIGDIIiTQYFVIiPFICTPTAGSTLAPLY 

TOR2_S FLYVYKGYQPIDVVRDLPSGFNTLKPIFKLPLGINITNFRAILTAFSPAQDIWGTSAAAY

AIBV MAWSKSQFCSAHCDFSEITVFVTHCYSSGSGSCPITGMIARGHIRISAMKWGSLFYNLTV 



229E GDCKGFSSDVLSDVIRYNLN-FEENLRRGT XLFKTSYGV-WFYCTNNT

PEDV GVCNGAAVDRAPEALRFNINDTSVILAEGS IVLHTALGTNIiSPVCSNS SD 

CCOV SQCNGVSLim'VDVIRFNrJSr-FTALVQSGMGATV-FSLOTTGGVILEISCYNDTO E 

PRC DQCNGAVLmTVDVlRFlSn^N-FTTWQSGKGATV-FSLNTTGGVTLEISCYiroTVS D 

FICV AFCP WTADVLRFWIJ^IFSDTDWTDSTNDEQLPFTFEDNTTAS lAC YSSANVTDFQ 

BoCov WVTPIjTSKQYLLAFNQDGVIFNAVDCKSDFMS EIKCKTLSIAPSTGVYELNG 

OC43 WVTPLTSKQYLLAFNQDGVIFNAVDCKSDFMS EIKCKTLSIAPSTGVYELNG 

PHEV WVTPLTTRQFLLAFDQDGVLYHAVDCASDFMS EIMCKTSSITPPTGVYELNG 

MHV WVTPLLKRQYLFNFNEKGVITSAVDCASSYIS EIKCKTQSLLPSTGVYDLSG 

TOR2_S FVGYLKPTTFMLKYDENGTITDAVDCSQNPLA ELKCSVKSFEIDKGIYQTSN 

AIBV SVSKYPNFKSFQCVWNFTSVYLNGDLVFTSNKTTDVTSAGVYFKAGGPVNYSIMK 

229E -LVSGDAHIPFGTVLGNFYCFVNTTIGTETTSAFVGALPKTVREFVISRTGHFYINGYRY

PEDV -PHLAIFAIPLGATEVPYYCFLKVDTYNSTVYKFLAVLPSTVREIVITKYGDVYVNGFGY 

CCov SSFYSYGEISFGVTDGPRYCFA LYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNF

PRC SSFSSYGEIPFGVTNGPRYCYV LYNGTALKYLGTLPPSVKEIAISKWGHFYINGYNF 

FICV PANNSVSHIPFGKT — AHFCFAN-FSHSIVSRQFLGILPPTVREFAFGRDGSIFVNGYKY 

BoCov -YTVQPIADVYRRIPNLPDCNIEAWLNDKSVPSPLNWERKTFSNCNFNMSSLMSFIQADS 

OC43 -YTVQPIADVYRRIPNLPTCNIEAWLNDKSVPSPLNWERKTFSNCOT 

PHEV -YTVQPVATVYRRIPDLPNCDIEAWLNSKTVSiSPLNWERKlFSNCNFmGRLMSFlQ^ 

MHV -YTVQPVGWYRRVPNLPDCKIEEWLTAKSVPSPLNWERRTFQNCNFNLSSLLRYVQAES 

TOR2_S -FRVVPSGDVVRFPNITNLCPFGEVFNATKFPSVYAWERKKISNCVADYSVLYNSTFFST

AIBV -EFKVLAYFVNGTAQDVILCDNSPKGLLACQYNTGNFSDGFYPFTNSTLVREKFIVYRES 

* 

229E FTLGNVEAVNFNVTTAETTD FFTVALASYADVLVNVSQTSIANIIYCNSVINRLRC

PEDV LHLGLLDAVTIYFTGHGTDDDVSGFWTIASTNFVDALIEVQGTSIQRILYCDDPVSQLKC 

CCov FSTFPIDCISFNLTTGDSGA FWTIAYTSYTDALVQVENTAIKKVTYCNSHINNIKC 

PRC FSTFPIDCISFNLTTGDSDV FWTIAYTSYTEALVQVENTAITNVTYCNSYVISJNIKC 

FICV FSLPAIRSVWFSISSVEEYG FWTIAYTNYTDVMVDVNGTAITRLFYCDSPLNRIKC 

BoCov FTCNNIDAAKIYGMCFSSIT IDKFAIPNGRKVDLQLGNLGYLQSFNYRIDTTATSC 

OC43 FTCNNIDAAKIYGMCFSSIT — — IDKFAIPNGRKVDLQLGNLGYLQSFNYRIDTTATSC 

PHEV FGCNNIDASRLYGMCFGSIT IDKFAIPNSRKVDLQVGKSGYLQSFNYKIDTAVSSC 

MHV LSCNNIDASKVYGMCFGSVS VDKFAIPRSRQIDLQIGNSGFLQTANYKIDTAATSC 

TOR2_S FKCYGVSATKLNDLCFSNVY ADSFVVKGDDVRQIAPGQTGVIADYNYKLPDDFMGC

AIBV SVNTTLALTWFTFTNVSNAQ PNSGGVHTFHLYQTQTAQSGYYNFNLSPLSQFVYKA 



229E DQLSFYVPDGFYSTSP — IQSVELPVSIVSLP VYHKHMFIVLYVDFKPQ 

PEDV SQVAFDLDDGFYPISSRNLLSHEQPISFVTLP SFNDHSFVNITVSAA 

CCov SQLTANLQNGFYPVAS— SEVGLVNKSWLLP SFYSHTSVNITIDLGMKR — 

PRC SQLTANLNNGFYPVSS — SEVGSVNKSWLLP SFLTHTIVNITIGLGMKR — 

FICV QQLKHELPDGFYSASM- -LVKKDLPKTFVTMP QFYHWMNVTLHWLNDTEKK 

BoCov -QLYYNLPAANVSVSRFNPSTWNRRFGFTEQPVFKPQPVGVFTHHDWYAQHCFKAPKNF 

OC43 -QLYYNLPAAIWSVSRFNPSIWNRRFGFTEQSVFKPQPAGVFTDHDVVYAQHCFKAPTNF 

PHEV -QLYYSLPAANVSVTHYNPSSWNRRYGPNNQS FGSRGLHDAVYSQQCPNTPNTY 

MHV -QLYYSLPKNlWTINimjPSSWNRRYGFKV^ 

TOR2_S -VLAWNTRNIDATSTG NYNYKYRYLRHG 

AIBV SDYMYGSYHPICAFRP ETINSGLWFNSLS 
SGGGKCFWC YPAGVNITL ANFNETKG PLCVDTSHFT TKYVAVYAN

FGGLSSANLVAS — DTTINGFSS PCVDTRQFTI TL»FYNVTNS 

SGYGQPIASTLS — NITLPMQDNNTD VYCIRSNRFSVYFHSTCKSSLWDDVFNS 

SGYGQPIASTLS — NITIjPMQDNNTD VYCVRSDQFSVYVHSTCKSALWDNVFKR 

YDIILAKAPELAALADVHFEIAQANGSVTNVTSLCVQARQLA LFYKYTSL 

— CPCKLDGSLCVGNGPGIDAGYKNSGIG TCPAGTNYLT CHNAA QCDC 

— CPCKLDGSLC VGNGPGIDAG YKNSG IG TCPAGTNYLT CHNAV QCNC 

— CPCRT — SQCIG G AGTG TCPVGTTVRK CFAAVTKATKCTC 
VGRWSASINTGNCPFSFGKVNNFVKFGSVCFSLKDIPGG-CAMPIVANWAYSKYYT

YGYVSKSQDS-NCPFTLQSVNDYLSFSKFCVSTSLLAGA-CTIDLFGYPAFGSGVK 

DCTDVLYATAVIKTGTCPFSFDKDNNYLTFNKFCLSLNPVGAN-CKFDVAARTRTNEQVV 
NCTDVLDATAVIKTGTCPFSFDKLNNYLTFNKFCLSLSPVGAN-CKFDVAARTRTNEQW 
QGLYTYSNLVELQlSr^DCPFSPQQFNlSre'LQFETLCFDVNPAVAG-CKWSLVHDVQWRTQFA 
LCTPDPITSKSTGPYKCPQTKYLVGIGEHCSGLAIKSDYCGGNPCTCQPQAFLGWSVDSC 
LCTPDPITSKSTGPYKCPQTKYLVGIGEHCSGLAIKSDYCGGNPCTCQPQAFLGWSVDSC 
WCQPDPSTYKGVNAWTCPQSKVSIQPGQHCPGLGLVEDDCSGNPCTCKPQAFIGWSSETC 

KLR PFERDISN— VPFSPDGKPCTPPALN-CYWPLNDYGFYTTTGI 

VSLTYGPLQGGYKQSVFSGKATCCYAYSYNGPRACKGVYSGELSRDFECG 
IG TLYVSWSDGDGITGVPQ-PVEGVSSFMWTLDKCTKYNIYDVSGVGVIRVSNDT

LT SLYFQFTKGELITGTPK-PLEGITDVSFMTLDVCTKYTIYGFKGEGIITLTNSS 

R SLYVIYEEGDNIVGVPS-DNSGLHDLSVLHLDSCTDYNIYGITGVGIIRQTWST 

R — SliYVIYEEGDS IVGVPS -DNSGLHDLSVLHLDSCTDYNIYGRTGVGI IRQTNRT 

T ITVSYKHGSMITTHAKGHSWGFQDTSVLVKDECTDYNIYGFQGTGIIRNTTSR 

LQGDRCNIFANFI FHDVNSGTTC - STDLQKSNTDI ILGVC VKTYDLYGITGQGIFVEVNAT 
LQGDRCNIFANFILHDVNSGTTC - STDLQKSNTDI ILGVCVNYDLYGITGQG IFVEVNAP 
LQNGRCNIFANFILNDVNSGTTC-STDLQQGNTIITTDVCVl^YDLYGITGQGILIEVNAT 

RCQIFANILLNGINSGTTC - STDLQLPNTEVATGVCVRYDLYGITGQGVFKEVKAD 

G YQPYRVVVLSFELLNAPA-TVCGPKLSTDLIKNQCVNFNFNGLTGTGVLTPSSKR

L LVYVTKSDGSRIQTRTEPLVLTQHNYNNITLDKCVAYNIYGRVGQGFXTNVTDS 
FLN--
ILA— 
LLS-- 
IiLS— 
LVA— 
YYNS- 
YYNS- 
YYNS- 
YYNS- 
FQP— 



GITYTSTSGNLLGFKDVTKGTIYSITPCNP PDQLWYQQAWGAM 

GVYYTSDSGQLLAFKWTSGAVYSVTPCSP SEQAAYVNDDIVGVI 

GLYYTSLSGDLLGFKNVSDGVIYSVTPCDV SAHAAVIDGAIVGAM 

GLYYTSLSGDLLGFKNVSDGVIYSVTPCDV SAQAAVIDGTIVGAI 

GLYYTSISGDLLAPKNSTTGEIFTWPCDL TAQVAVINDEIVGAI 

WQNLLYDSNGNLYGFRDYLTNRTFMIRSCYSG — RVSAAFHANS SEPAL 

WQNLLYDSNGNLYGFRDYLTNRTFMIRSCYSG — RVSAAFHANS SEPAL 

WQNLL YDS SGNL YGFRDYLSNRTFL IRSC YSG — RVSAVFHANS SEPAL 

WQALLYDVNGNLNGFRDLTTNKTYTIRSCYSG — RVSAAYHKEAPEPAL 

FQQFGRDVSDFTDSVRDPKTSEILDISPCAFGGVSVITPGTNASSEVAV 

VANFSYLADGGLAILDTSGAIDVFWQGSYGLNYYKVNPCEDVN — QQPWSGGNIVGIL 
LSENFTSY GFSNWELPKFFYASNGTYN^

SSLSNST FNNTRELPGFFYHSNDGSN 

TS INSELL -GLTHWTTTPNF Y Y YS I YNYTNERTRGTAID — SND 

TS INS ELL GLTHWTITPNF YYYS I YNYTNDKTRGTPID — SND 

TAVNQTDLFEFVNNTQARRSRSSTPNFVTSYTMPQFYYITKWNNDTS~S 

LFRNIKCN YVFNNTLSRQLQPINYFDSYLGCWNADN STS 

LFRNIKCN YVFNNTLSRQLQPINYFDSYLGCWNADN STA 

MFRNLKCS HVFNNTILRQIQLVNYFDSYLGCWNAYN NTA 

LYRNINCS YVFTNNISREENPLNYFDSYLGCWNADN RTD 

LYQDVNCT DVSTAIHADQLTPAWRIYSTGNNVFQTQAGCLIGAEHV

TSRNETGS E-QVENQFYVKLTNSSHRRRRS IG 
-CTDAVLTYSSFGVCADGSIIAVQ-
-CTEPVLVYSNIGVCKSGSIGYV — 
VDCEPI ITYSNIGVCKNGALVF I — 
VGCEPVITYSNIGVCKNGALVFI - - 



- PRNVS YDS VS AI VTANLS - 
- P SQYGQVKI APTVTGNI S ~ 
-NVTHSDGDVQPISTGNVT- 
-NVTHSDGDVQPI STGNVT- 



-NCTSAITYSSFAICNTGEIKYVNVTHVEIVDDSIGVIKPVSTGNIS 

SWQTCDLTVGSGYCVDYSTKRRSR-RAITTGYRFTNFEPFTVNSVNDSLEPVGGLYEIQ 
SAVQTCDLWGSGYCVDYSTKRRSR-RAITTGYRPTNFEPFTWSVNDSLEHVGGLYEIQ
SAVSTCDLTVGSGYCVDYVTALRSR-RSPTTGYRFTNFEPPAANLVNDSIEPVGGLYEIQ 
EALPNClSaiRMGAGLCVDYSKSRRAR-RSVSTGYRLTTFEPYMPMLiVNDSVQSVGGLYEMQ 

DTSYECDIPIGAGICASYHTVSLLRSTSQKSIVAYTMSLGADSSIAYSNN TIA 

QNVTSCPYVSYGRFCIEPDGSbKMI VPEEIiKOFVAPIiLNITES VL 



229E IPSNWTISVQVEYLQITSTPIVVDCSTYVCNGNVRCVELLKQYTSACKTIEDALRNSARL
PEDV IPTNPSMSIRTEYLQLYNTPVSVDCATYVCNGNSRCKQLLTQYTAACKTIESALQLSARL
CCov IPTNFTISVQVEYIQVYTTPVSIDCSRYVCNGNPRCNKLLTQYVSACQTIEQALAMGARL
PRC IPTNFTISVQVEYIQVYTTPVSIDCSRYVCNGNPRCNKLLTQYVSACQTIEQALAMGARL
FICV IPKNFTVAVQAEYIQIQVKPVVVDCATYVCNGNTHCLKLLTQYTSACQTIENALNLGARL
BoCov IPSEFTIGNMEEFIQTSSPKVTIDCSAFVCGDYAACKSQLVEYGSFCDNINAILTEVNEL
OC43 IPSEFTIGNMEEFIQTSSPKVTIDCSAFVCGDCAACKSQLVEYGSFCDNINAILTEVNEL
PHEV IPSEFTIGNLEEFIQTRSPKVTIDCATFVCGDYAACRQQLAEYGSFCENINAILTEVNEL
MHV IPTNFTIGHHEEFIQIRAPKVTIDCAAFVCGDNAACRQQLVEYGSFCDNVNAILNEVNN
TOR2_S IPTNFSISITTEVMPVSMAKTSVDCNMYICGDSTECANLLLQYGSFCTQLNRALSGIAAE
AIBV IPNSPNLTVTDEYIQTRMDKVQINCLQYVCGNSLECRKLFQQYGPVCDNILSWWSVSQK



229E ESADVSEMLTPDKKAFTLANVSSF-GD YNLSSVIPS LPTSGSR --
PEDV ESVEVNSMLTISEEALQLATISSFNGDG YNFTNVLGASVY DPASGRV --
CCov ENMEIDSMLFVSENALKLASVEAFNSTETLDPIYKEWPNIGGSWLGGLKDILPSHNSK--
PRC ENMEVDSMLFVSENALKLASVEAFNSSETLDPIYTQWPNIGGFWLEGLKYILPSDNSK--
FICV E SLMLNDMITVSDRGLELATVERFNATA^ LGGEKLGGLYFDG LS SLLPPK --
BoCov LDTTQLQVANSLMNGVTLSTKLKDGVN FNVDDINFSPVLG CLGSACNK --
OC43 LDTTQLQVANSLMNGVTLSTKLKDGVN FNVDDVNFSPVLG CLGSECNK--
PHEV LDTTQLQVANSLMNGVTLSTKIKDGIN FNVDDINFSPVLG----CLGSECNR --
MHV LDNMQLQVASALMQGVTISSRLPDGIS GPIDDINFSPLLG---CIGSTCAEDG
TOR2_S QDRNTREVFAQVKQMYKTPTLKYFGGF NFSQ ILPDPLKP
AIBV EDMELLSFYSSTKPKGYDTPVLSNVSTG EFNISLLLTPPSSP



229E VAGRSAIEDILFSKIVTSGLGTVDADYKNCTKGLS--IADLACAQYYNGIMVLPG
PEDV VQKRSVIEDLLFNKWTJSFGLGTVDEDYKRCSNGRS--VADLVCAQYYSGVMVLPG
CCov RKYRSAIEDLLFDKWTSGLGTVDEDYKRCTGGYD--IADLVCAQYYNGIMVLPG
PRC RKYRSAIEDLLFSKWTSGLGTVDEDYKRCTGGYD--IADLVCAQYYNGIMVLPG
FICV IGKRSAVEDLLPNKWTSGLGTVDDDYKKCSSGTD--VADLVCAQYYNGIMVLPG
BoCov VSSRSAIEDLLFSKVKLSDVG-FVEAYNNCTGGAE--IRDLICVQSYNGIKVLPP
OC43 VSSRSAIEDLLFSKVRLSDVG-FVEAYNNCTGGAG--IRDLICVQSYNGIKVLPP
PHEV ASTRSAIEDLLFDKVKLSDVG-FVQAYNNCTGGAE--IRDLICVQSYNGIKVLPP
MHV NGPSAIRGRSAIEDLLFDKVKLSDVG-PVEAYNNCTGGQE--VRDLLCVQSFNGIKVLPP
TOR2_S TKRSFIEDLLFNKVTLADAG-FMKQYGECLGDIN--ARDLICAQKFNGLTVLPP
AIBV SGRSFVEDLLPTSVETVGLP-TDAEYKKCTAGPLGTLKDLICAREYNGLLVLPP



229E VADAERMAMYTGSLIGGIALGGLT SAVSIPFSLAIQARLNYVALQTDVLQENQKIL
PEDV WDAEKLHMYSASLIGGMALGGIT AAAALPFSYAVQARLNYLALQTDVLQRNQQLL
CCov VANDDKMAMYTASLAGGITLGSLGG GAVSIPFAIAVQARLNYVALQTDVLNKNQQIL
PRC VANADKMTMYTASLAGGITLGAFGG GAVSIPFAVAVQARLNYVALQTDVLNKNQQIL
FICV WDGNKMSMYTASLIGGMALGSIT SAVAVPPAMQVQARLNYVALQTDVLQENQKIL
BoCov LLSVNQISGYTLAATSASLFPPLS -AAVGVPPYLNVQYRINGIGVTMDVLSQNQKLI
OC43 LLSDNQISGYTLAATSANLFPPWS AAAGVPPYLNVQYRINGIGVTMDVLSQNQKLI
PHEV LLSENQISGYTLAATAASLFPPWT -AAAGVPFYLNVQYRINGLGVTMDVLSQNQKLI
MHV VLSESQISGYTAGATAAAMFPPWT AAAGVPFSLNVQYRINGLGVTMNVLSENQKMI
TOR2_S LLTDDMIAAYTAALVSGTATAGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKQI
AIBV IITADMQTMYTASLVGAMAFGGIT SAAAIPFATQIQARINHLGIAQSLLMKNQEKI



229E AASPNKAMTNIVDAFTGVNDAITQTSQALQTVATALNKIQDWNQQGNSLNHLTSQLRQW
PEDV AESFNSAIGNITSAFESVKEAISQTSKGLNTVAHALTKVQEWNSQGSALNQLTVQLQHN
CCov ANAFNQAIGNITQAFGKVTOAIHQTSQGLATVAKVLAKVQDVVKTQGQALSHLTLQLQNN
PRC ASAFNQAIGNITQSFGKVNDAIHQTSRGLTTVAKALAKVQDVVNTQGQALRHLTVQLQNN
FICV ANAFNNAIGNITLALGKVSNAITTTSDGPNSMASTU^TKIQSWNQQGEALSQLTSQLQKN
BoCov ANAFNNALDAIQEGFDATN S - ALVKIQAWNANAEALNNLLQQL SNR
OC43 ANAFNNALDAIQEGFDATN S - ALVKIQAWNADAEALNNLLQQLSNR
PHEV ASAFNNALDAIQEGFDATN S -ALVKIQAWNANAEALNNLLQQL SNR
MHV ASAFNNALGAIQEGFDATN S-ALGKIQSWNANAEALNNLLNQLSNR
TOR2_S ANQFNKAISQIQESLTTTS TALGKLQDVVNQNAQALNTLVKQLSSN
AIBV AASFNKAIGHMQEGFRSTS LALQQVQDWNKQSAILTETMNSLNKN

* ** *. . . * ..* *** . * .* . 
229E FQAISSSIQAIYDRLDTIQADQQVDRLITGRLAALNVFVSHTLTKYTEVRASRQLAQQKV

PEDV FQAISSSIDDIYSRLDILLADVQVDRLITGRLSALNAFVAQTLTKYTEVQASRKLAQQKV

CCov FQAISSSISDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKV

PRC FQAISSSISDIYNRLDELSADAQVDRLITGRLTALNAFVSQTLTRQAEVRASRQLAKDKV

FICV FQAISSSIAEIYNRLEKVEADAQVDRLITGRLAALNAYVSQTLTQYAEVKASRQIALEKV

BoCov FGAISSSLQEILSRLDALEAQAQIDRLINGRLTALNVYVSQQLSDSTLVKFSAAQAMEKV

OC43 FGAISSSLQEILSRLDALEAQAQIDRLINGRLTALDAYVSQQLSDSTLVKFSAAQAMEKV

PHEV FGAISASLQEILSRLDALEAKAQIDRLINGRLTALNAYVSQQLSDSTLVKFSAAQAIEKV

MHV FGAISASLQEILTRLDAVEAKAQIDRLINGRLTALNAYISKQLSDSTLIKFSAAQAIEKV

TOR2_S FGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKM

AIBV FGAISSVIQDIYAQLDAIQADAQVDRLITGRLSSLSVLASAKQSEYIRVSQQRELATQKI

229E NECVKSQSKRYGFCG-NGTHIFSIVNAAPEGLVFLHTVLLPTQYKDVEAWSGriCV-DG--

PEDV NECVKSQSQRYGFCGGDGEHIFSLVQAAPQGIiLFLHTVLVPGDFV3WIiAIAGLCV-NG — 

CCov WECVRSQSQRFGFCG-NGTHLFSLANAAPNGMIFFHTVLLPTAYETVTAWSGICASDGDR 

PRC NECVRSQSQRFGFCG-NGTHLFSLANAAPNGMIFFHTVLLPTAYETVTAWSGICALDGDR 

FICV NECVKSQSNRYGFCG-NGTHLFSLVNSAPEGLLFFHTVLLPTEWnSEWAWS — 

BoCov NECVKSQSSRINFCG-NGNHI I SLVQNAPYGLYFIHFS YVPTKYVTAKVSPGLC I — 

OC43 NECVKSQSSRINFCG-NGNHIISLVQNAP YGLYFXHFS YVPTKYVTAKVSPGLCI 

PHEV NECVKSQSSRINFCG-NGNHIISLVQNAPYGIiYFIHFSYVPTKYVTAKVSPGLCI 

MHV NECVKSQTTRINFCG-NGNHILSLVQNAPYGLCFIHFSYVPTSFKTANVSPGLCI 

TOR2„S SECVLGQSKRVDFCG-KGYHLMSFPQAAPHGWFLHVTYVPSQERNFTTAPAICH 

AIBV > NECVKSQSNRYGFCG-SGRHVLSIPQNAPNGIVFIHFTYTPETFVNVTAIVGFCVNPLNA 



229E     TNGYVDRQPNLALYK -EGNYYRITSRXMFEPRIPTMADFVQIENCWVTFVNTSRS
PEDV     EIALTLREPGLVLFTHEIiQTYTATEYFVSSRRMFEPRKPTVSDFVQIESCWTYVNLTSD
CCov     TFGLWKDVQLTLFRN LDDKFYLTPRTMYQPIVATSSDFVQIEGCDVLFVNATVI
PRC      TFGLWKDVQLTLFRN LDDKFYLTPRTMYQ PRVATS SDFVQ I EGCDVLF VNTTVS
FICV     -YAYVLKDFDHSIFS YNGTYMVTPRNMFQPRKPQMSDFVQITSCEVTFLNMTYT
BoCov    -AGDRGIAPKSGYFVN VNNTWMFTGSGYYYPEPITGNNVVVMSTCAVNYTKAPDV
OC43     -AGDRGIAPKSGYFVN VNNTWMFTGSRYYYPEPITGNIWVVMSTCAVNYTKAPDV
PHEV     -AGDIGISPKSGYFIN VNWSWMFTGSSYYYPEPITQNNVVVMSTCAVNYTKAPDL
MHV      - SGDRGLAPKAGYFVQ DNGEWKFTGSNYYYPEPITDKNSVAMI SCAVNYTKAPEV
TOR2_S   -EGKAYPPREGVFVFN GTSWFITQRNFFSPQIITTDNTFVSGNCDWIGIINNT
AIBV     SQYAIVPANGRGIFIQ VNGTYYITSRDMYMPRDITAGDIVTLTSCQANYVNVNKT



229E     ELQTIVP-EYIDVNKTLQEIiSYKIi-PNYTVPDIiV VEQYNQTIliNLTSEISTLENKSA
PEDV     QLPDVXP-DYIDVNKTDDEIIjASL-PNRTGPSLP IiDVFNATYLNIiTGEIADIiEQRSE
CCov     DIjPSIIP-DYIDINQTVQDIIiENFRPNWTVPELP LDIFNATYLNLTGEINDLEFRSE
PRC      DLPSI IP-DYIDINQTVQDILENFRPNWTVPELT LDVFNATYLNLTGE IDDLEFRSE
FICV     TFQEIVI-DYIDINKTIADMLEQYNPNYTTPELNL-LLDIFNQTKLNLTAEIDQLEQRAD
BoCov    MLNISTP-NLHDFKEELDQWFKNQ--TSVAPDLSL~DY — IWTFLDLQDEMN
OC43     MLNISTP-NLPDFKEELDQWFKNQ — TLVAPDLSL-DY — INVTFLDLQDEMN
PHEV     MLNTSTP-NLPDFKEELYQWFKNQ'--SSVAPDLSL-DY — INVTFLDLQDEMN
MHV      FLNNSIP - WLPDFKEELDKWFKNQ — TS lAPDL SL~ DFEKLNVTFLDLTYEMN
TOR2_S   VYDPLQP-ELDSFKEELDKYFKNH^ TSPDVDLGDISGINASWNIQKEID
AIBV     VITTFVEDDDFNFDDELSKWWWDT — KHGLPDFD DFNYTVPILNISGBID

: . . * . . ..:::*: 

229E     ELNYTVQKLQTLIDNINSTLVDLKWLNRVETYIKWPWWWLC I SWLIFWSMLLLCCCS
PEDV     SLRNTTEELRSLINNINNTLVDLEWLNRVETYIKWPWWVWLIIVIVLIFWSLLVFCCIS
CCov     KLHNTTVELAILIDNINNTLVNLEWLNRIETYVKWPVm/WLLIGIiVVIFCIPILLFCCCS
PRC      KLHNTTVELAILIDNINNTLVNLEWIJimiETYVKWPWYVWLLIGLVVIFCIPLLLFCCCS
FICV     NLTTIAHELQQYIDNLNKTLVDLDWLNRIETYVKWPWYVWLLIGLVVVFCIPLLLFCCLS
BoCov    RLQEAIKVLNQSYINLKDIGTYEYYVKWPWYVWLLIGFAGVAMLVLLFFICCC
OC43     RLQEAIKVLNQSYINLKDIGTYEYYVKWPWYVWLLIGFAGVAMLVLLFFICCC
PHEV     RLQEAIKVLNQSYINLKDIGTYEYYVKWPWYVWLLIGLAGVAMLVLLFFICCC
MHV      RIQDAIKKLNESYINLKEVGTYEMYVKWPWYVWLLIGLAGVAVCVLLFFICCC
TOR2_S   RLNEVAKNLJraSLIDLQELGKYEQYIKWPWYVWLGPIAGLIAIVMVTILIiCCM
AIBV     NIQGVIQGLNDSLINLEELSIIKTYXKWPWYVWIiAIGFAIIIFILILGWVFFM

.: . ;*.: ::*. :, : *;****:*** : : ;

229E     TGCCG-FFSCFASSIRGCCESTKL-PYYDVEKIHIQ
PEDV     TGCCG-CCGCCGACFSGCCRGPRTiQPYEAFEKVHVQ
CCov     TGCCG-CIGCLGSCCHSICSRRQFESYEPIEKVHVH
PRC      TGCCG^-CIGCLGSCCHSIFSRRQFENYEPIEKVHVH
FICV     TGFCG-CFGCVGSCCHSLCSRRQFETYEPIEKVHIH
BoCov    TGCGTSCFKICGGCCD-DYTGHQEIiVIK- — TSHDD
OC43     TGCGTSCFKKCGGCCD-DYTGHQELVIK TSHEG
PHEV     TGCGTSCFKKCGGCCD-DYTGHQEFVIK TSHDD
MHV      TGCGSCCFRKCGSCCD-EYGGHQDSIVIHNISAHED
TOR2_S   TSCCSCLKGACSCGSCCKFDEDDSEPVLKGVKLHYT
AIBV     TGCCGCCCGCFGIIPLISKCGKKSSYYTTFDNDWTEQYRPKKSV



Key      Name                                                                   Genbank    % ID*

229E     spike glycoprotein [Human coronavirus 229E]                            AAK32191   28.6%   (SEQ ID NO: 53)
AIBV     spike glycoprotein [Avian infectious bronchitis virus]                 AA034396   27.6%   (SEQ ID NO: 54)
BoCov    E2 glycoprotein precursor (Spike glycoprotein)                         P25193     30.5%   (SEQ ID NO: 55)
CCoV     spike protein - canine coronavirus                                     S41453     26.1%   (SEQ ID NO: 56)
FICV     peplomer protein [Feline infectious peritonitis virus]                 BAA06805   25.4%   (SEQ ID NO: 57)
MHV      E2 glycoprotein precursor (Spike glycoprotein)                         P11225     31.9%   (SEQ ID NO: 58)
OC43     surface protein - human coronavirus                                    S44241     30.7%   (SEQ ID NO: 59)
PEDV     spike protein [Porcine epidemic diarrhea virus]                        CAA80971   26.0%   (SEQ ID NO: 60)
PHEV     spike glycoprotein [porcine hemagglutinating encephalomyelitis virus]  AMiSO 031  30.5%   (SEQ ID NO: 61)
PRC      S protein [Porcine respiratory coronavirus]                            AAA469a5   27.5%   (SEQ ID NO: 62)
TOR2_S   SARS associated virus S glycoprotein                                                      (SEQ ID NO: 33)

10 20 30 40 50

TOR2_E MYSFVSEETGTLrVNSVLLFLAFWFLLOTIjAILTALRLCAYCasriVWSLV^ 

PGV MTFPRALWIDDNG-MVINIIFWFLLIIILILLSIALLNIIKLCMVCCNLGRTVIIVPAQ 
10 20 30 40 50 

60 70
TOR2_E YVYSRVKNLNSSEGVPDLLV (SEQ ID NO: 35)

PGV HAYDAYKNFMRIKAYKPDGALLA (SEQ ID NO: 63)
60 70 80

MESLYLGVNEKTHVQLSLPVLQVRDVLVRGFGDSVEEALSEAREHLKNGT

CGLVELEKGVLPQLEQPYVFIKRSDALSTNHGHKWELVAEMDGIQYGRS 

GITLGVLVPHVGETPIAYRNVLLRKNGNKGAGGHSYGIDLKSYDLGDELG 

TDPIEDYEQISJWNTKHGSGALjRELTRELNGGAVTRYVDNNFCGPDGYPLDC 

IKDFLARAGKSMCTLSEQLDYIESKRGVYCCRDHEHEIAWFTERSDKSYE 

HQTPFEIKSAKKFDTFKGECPKFVFPLNSKVKVIQPRVEKKKTEGFMGRI 

RSVYPVASPQECNWMHIiSTLMKCNHCDEVSWQTCDFLKATCEHCGTENLV 

lEGPTTCGYLPTNAWKMPCPACQDPEIGPEHSVADYHNHSNIETRLRKG 

GRTRCFGGCVFAYVGCYNKRAYWVPRASADIGSGHTGITGDNVETLNEDIi 

LEILSRERWINIVGDFHLNEEVAIILASFSASTSAFIDTIKSLDYKSFK 

TIVESCGNYIWTKGKPVKGAWNIGQQRSVLTPLCGFPSQAAGVIRSIFAR 

TLDAANHSI PDLQRAAVTILDGISEQSLRIjVDAMVYTSDIiLTNSVI IIIAY 

VTGGLVQQTSQWLSNLLGTTVEKLRPIFEWIEAKLSAGVEFLKDAWEIIiK 

FLITGVFDIVKGQIQVASDNIKDCVKCFIDWNKALEMCIDQVTIAGAKL 

RSLNLGEVFIAQSKGLYRQCIRGKEQIiQIiLMPLKAPKEVTFLEGDSHDTV 

LTSEEWLKNGEIjEALETPVDSFTNGAIVGTPVCVNGLMLLEIKDKEQYC 

ALSPGLLATNNVFRLKGGAPIKGVTFGEDTVWEVQGYKNVRITFELDERV 

DKVLNEKCSVYTVESGTEVTEPACWAEAWKTLQPVSDLLTNMGIDLDE 

WSVATFYLFDDAGEENFSSRMYCSFYPPDEEEEDDAECEEEEIDETCEHE 

YGTEDDYQGLPLEFGASAETVRVEEEEEEDWLDDTTEQSEIEPEPEPTPE 

EPVNQFTGYLKLTDNVAIKCVDIVKEAQSANPMVIVNAANIHLKHGGGVA 

GALNKATNGAMQKESDDYIKLNGPLTVGGSCLLSGHNLAKKCLHWGPNL 

NAGEDIQLLKAAYENFNSQDILIiAPLLSAGIFGAKPLQSLQVCVQTVRTQ 

VYIAVNDKALYEQWMDYLDNLKPRVEAPKQEEPPNTEDSKTEEKSWQK 

PVDVKPKIKACIDEVTTTLEETKFIiTNKLIjLFADINGKIjYHDSQNMLRGE 

DMSFLEKDAPYMVGDVITSGDITCWIPSKKAGGTTEMIjSRALKKVPVDE 

YITTYPGQGCAGYTLEEAKTALKKCKSAFYVLPSEAPNAKEEILGTVSWN 

LREMLAHAEETRKLMPICMDVRAIMATIQRKYKGIKIQEGIVDYGVRFFF 

YTSKEPVASIITKLNSLNEPLVTMPIGYVTHGFNLEEAARCMRSLKAPAV 

VSVSSPDAVTTYNGYLTSSSKTSEEHFVETVSLAGSYRDWSYSGQRTELG 

VEFLKRGDKIVYHTLESPVEFHLDGEVLSIiDKLKSLLSLREVKTIKVFTT 

VDNTNLHTQLVDMSMTYGQQFGPTYLDGADVTKIKPHVNHEGKTFFVLPS 

DDTLRS EAFE YYHTLDE S FLGRYMSALNHTKKWKFPQVGGIiTSIKWADNN 

CYL S SVLLALQQLEVKFNAPALQEAYYRARAGDAANFCALIIiAYSNKTVG 

ELGDVRETMTHLLQHANLESAKRVLNWCKHCGQKTTTLTGVEAVMYMGT 

LSYDNLKTGVSIPCVCGRDATQYLVQQESSFVMMSAPPAEYKLQQGTFLC 

ANEYTGNYQCGHYTHITAKETLYRIDGAHIiTKMSEYKGPVTDVFYKETSY 

TTTIKPVSYKLDGVTYTEIEPKLDGYYKKDNAYYTEQPIDLVPTQPLPNA 

SFDNFKLTCSNTKFADDLNQMTGFTKPASRELSVTPFPDLNGDWAIDYR 

HYSASFKKGAKLLHKPIVWHINQATTKTTFKPNTWCLRCLWSTKPVDTSN 

SFEVLAVEDTQGMDNLACESQQPTSEEWENPTIQKEVIECDVKTTEWG 

NVILKP SDEGVKVTQELGHEDLMAAYVENTS ITIKKPNELSLALGLKTI A 

THGIAAINSVPWSKILAYVKPFLGQAAITTSNCAKRLAQRVFNNYMPYVF 

TLIiFQLCTFTKSTNSRIRASLPTTIAKNSVKSVAKLCLDAGINYVKSPKF 

SKLFTIAMWLLLLSICLGSIiICVTAAFGVLIiSNFGAPSYCNGVRELYLNS 

SOTTTMDFCEGSFPC S ICL SGLDSLDS YPALETI QVTI S S YKLDLTILGL 

AAEWVLAYMLFTKFFYLLGLSAIMQVFFGYFASHFISNSWLMWFIISIVQ 

MAPVSAMVRMYIFFASFYYIWKSYVHIMDGCTSSTCMMCYKRNRATRVEC 

TTIWGMKRSFYVYANGGRGFCKTHNWNCIiNCDTFCTGSTFISDEVARDL 

SLQFKRPINPTDQSSYIVDSVAVKNGALHLYFDKAGQKTYERHPLSHFVN 

LDNLRANNTKGSLPINVIVFDGKSKCDESASKSASVYYSQLMCQPILLLD 

QALVSDVGDSTEVSVKMFDAYVDTFSATFSVPMEKLKALVATAHSELAKG 

VALDGVLSTFVSAARQGWDTDVDTKDVIECLKLSHHSDLEVTGDSCNNF 

MLTYNKVENMTPRDLGACIDCNARHINAQVAKSHNVSLIWNVKDYMSLSE 

QLRKQIRSAAKKNNIPFRLTCATTRQWNVITTKISLKGGKIVSTCFKLM 

LKATLLCVLAALVCYIVMPVHTLSIHDGYTNEIIGYKAIQDGVTRDIIST 

DDCFANKHAGFDAWFSQRGGSYKNDKSCPWAAIITREIGFIVPGLPGTV 

LRAINGDFLHFLPRVFSAVGNICYTPSKLIEYSDFATSACVLAAECTIFK 

DAMGKPVPYCYDTNLLEGSISYSEIiRPDTRYVIiMDGSIIQFPNTYLEGSV 

RVVTTFDAEYCRHGTCERSEVGICLSTSGRWVIjNNEHYRAIiSGVFCGVDA 

MNLIANIFTPLVQPVGAIiDVSASWAGGIIAILVTCAAYYFMKFRRVFGE 

YNHWAANALLFLMSFTILCLVPAYSFLPGVYSVFYLYIiTFYFTNDVSFL 

AHLQWFAMFSPIVPFWITAIYVFCISLKHCHWPFNNYLRKRVMFNGVTFS 

TFEEAALCTFLLNKEMYLKLRSETLLPLTQYNRYLALYTSFKYKYFSGALDT 

TSYREAACCHLAKAIiNDFSNSGADVLYQPPQTSITSAVLQSGFRKMAFPS 

GKVEGCMVQVTCGTTTLNGLWLDDTVYCPRHVICTAEDMLNPNYEDLLIR 

KSNHSFLVQAGNVQLRVIGHSMQNCLLRLKVDTSNPKTPKYKFVRIQPGQ 

TFSVLACYNGSPSGVYQCAMRPNHTIKGSFLNGSCGSVGFNIDYDCVSFC 

YMHHMELPTGVHAGTDLEGKFYGPFVDRQTAQAAGTDTTITLNVIjAWLiYA 

AVINGDRWFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPIiSAQTGIA 

VLDMCAALKELLQNGMNGRTILGSTILEDEFTPFDWRQCSGVTFQGKFK

KIVKGTHHWMLLTFLTSLLILVQSTQWSLFFFVYENAFLPFTLGIMAIAA

CAMLLVKHKHAFLCLFLLPSLATVAYFNMVYMPASWVMRIMTWLELADTS 

liSGYRLKDCVMYASALVLLILMTARTVYDDAARRVWTLMWITLVYKVr^ 

GNALDQAISMWALVISVTSNYSGWTTIMFLARAIVFVCVEYYPLLFITG 

NTLQCIMLVYCFLGYCCCCYFGIiFCLLNRYFRLTLGVYDYLVSTQEFRYM 

NSQGLLPPKSSIDAFKLNIKLLGIGGKPCIKVATVQSKMSDVKCTSWLL 

SVLQQLRVESSSKLWAQCVQLHNDILLAKDTTEAPEKMVSLLSVLLSMQG 

AVDINRLCEEMLDNRATLQAIASEFSSLPSYAAYATAQEAYEQAVANGDS 

EWLKKLKKSLNVAKSEFDRDAAMQRKLEKMADQAMTQMYKQARSEDKRA 

KVTSAMQTMLFTMLRKLDNDALNNI INNARDGC VPLWI I PLTTAAKLMW 

VPDYGTYKNTCDGNTFTYASAIiWEIQQWDADSKIVQLSEIISnviDNSPNLA 

WPLIVTALRANSAVKLQNNELSPVALRQMSCAAGTTQTACTDDNALAYYN 

NSKGGRFVIiALLSDHQDLKWARFPKSDGTGTIYTELEPPCRFVTDTPKGP 

KVKYLYFIKGLmLNRGMVLGSLAATVRLQAGNATEVPANSTVLSFCAF^ 

VDPAKAYKDYLASGGQPITNCVKiyn^CTHTGTGQAITVTPEAlQim 

ASCCLYCRCHIDHPNPKGFCDLKGKYVQIPTTCANDPVGFTLRNTVCTVC 

GMWKGYGCSCDQLREPLMQSADASTF 

(SEQ ID NO: 64)

FKRVCG

VSAARLTPCGTGTSTDWYRAFDIYNEKVAGFAKFLKTNCCRFQEKDEEG 

NLLDSYFVVKRHTMSKTX'QHEETiyNLVKDCPAVAVHDFFKFRVDGDMVPH 

ISRQRLTKYTMADLVYALRHFDEGNCDTLKEILVTYNCCDDDYFNKKDWY 

DFVENPDILRVYMJLGERWQSLIiKTVQFCDiyyiRDAGIVGVIiT^ 

GNWYDFGDFVQVAPGCGVPIVDSYYSLLMPILTIiTRAIiAAESHMDADLAK 

PLIKWDLLKYDFTEERLCLFDRYFKYWDQTYHPNCINCLDDRCILHCANF 

ITVLFSTVFPPTSFGPLVRKIFVDGVPFWSTGYHFRELGVVHNQDVNLHS 

SRLSFKELLVYAADPAMHAASGNLLLDKRTTCFSVAALTNNVAFQTVKPG 

NFNKDFYDFAVSKGFFKEGSSVELKHFFFAQDGNAAISDYDYYRYNLPTM 

CDIRQLLF WEWDKYFDC YDGGC INANQVI VISINLDKSAGF PFNKWGKAR 

liYYDSMSYEDQDALFAYTKRWIPTITQMNLKYAISAKJSTRARTVAGVSIC 

STMTNRQFHQKIiliKS I AATRGATWIGTSKFYGGWHNMLKTVY SDVETPH 

liMGWDYPKCDRAMPNiyiliRII^ASLVIjARKHNTCCNLSHRFYRIiANECAQVL 

SEMVMCGGSLYVKPGGTSSGDATTAYANSVFNICQAVTANWALLSTDGN 

KIADKYVRNLQHRLYECLYRNRDVDHEFVDEFYAYLRKHFSMMILSDDAV 

VCYNSNYAAQGLVASIKNFKAVLYYQNNVFMSEAKGWTETDLTKGPHEFC 

SQHTMLVKQGDDYVYIiP YPDP SRI LGAGCFVDDIVKTDGTLMIERFVSLA 

IDAYPLTKHPNQEYADVFHLYLQYIRKLjHDEIjTGHJyaiDMYSVMLTNDNTS 

RYWEPEFYEAMYTPHTVLQAVGACVLCNSQTSLRCGACIRRPFLCCKCCY 

DHVI STSHKLVLSVNPYVCNAPGCDVTDVTQLYLGGMSYYCKSHKPP I SF 

PLCANGQVFGLYKNTCVGSDNVTDFNAIATCDWTNAGDYILANTCTERLK 

LFAAETLKATEETFKLSYGIATVREVLSDRELHLSWEVGKPRPPIJmiSn^ 

FTGYRVTKNSKVQIGEYTFEKGDYGDAWYRGTTTYKLNVGDYFVLTSHT 

VMPLSAPTLVPQEHYVRITGLYPTLNI SDEFS SNVANYQKVGMQKYSTLQ 

GPPGTGKSHFAIGLALYYPSARIVYTACSHAAVDALCEKAbKYLPIDKCS 

RI I PARARVECFDKFKVNSTLEQYVFCTVNALPETTADIVVFDEI SMATN 

YDLSWNARLRAKHYVYIGDPAQLPAPRTLLTKGTLEPEYFNSVCRLMKT 

IGPDMFLGTCRRCPAEIVDTVSALVYDNKLKAHKDKSAQCFKMFYKGVIT 

HDVSS AINRPQ IGWREFLTRNPAWRKAVF I S PYNSQNAVASKILGLPTQ 

TVDSSQGSEYDYVIFTQTTETAHSCNVNRFNVAITRAKIGILCIMSDRDL 

YDKLQFTSLEIPRRWATLQAENVTGLFKDCSKIITGLHPTQAPTHLSVD 

IKFKTEGLCTOI PG I PKDMTYRRL I SMMGFKMNYQWGYPl^MF ITREEA 

RHVRAWIGFDVEGCHATRDAVGTNLPLQLGFSTGVNLVAVPTGYVDTENN 

TEFTRVNAKPPPGDQFKHLIPLMYKGLPWNWRIKIVQMLSDTLKGLSDR 

WFVLWAHGFELTSMKYFVKIGPERTCCLCDKRATCFSTSSDTYACWNHS 

VGFDYVYNPFMIDVQQWGFTGNLQSNHDQHCQVHGNAHVASCDAIMTRCL 

AVHECFVKRVDWSVEYPIIGDELiRVNSACRKVQHMWKSALLADKFPVLH 

DIGNPKAIKCVPQAEVEWKFYDAQPCSDKAYKIEELFYSYATHHDKFTDG 

VCIiFVmCNVDRYPANAIVCRFDTRVLSNIiNLPGCDGGSLYVNKHAFHTPA 

FDKSAFTNLKQLPFFYYSDSPCBSHGKQWSDIDYVPLKSATCITRCNLG 

GAVCRHHANEYRQYLDAYNMMISAGFSLWIYKQFDTYNIiWNTFTRLQSIjE 

NVAY1WVNKGHFDGHAGEAPVSIINNAWTKVIX3IDVEIFENKTTLPVNV 

AFELWAKRNIKPVPEIKILKnsnijGVDIAANWIWDYKREAPAHVSTIGVCT 

MTDIAKKPTESACSSLTVLFDGRVEGQVDIiFRNARNGVLITEGSVKGLTP 

SKGPAQASVNGVTLIGESVKTQFNYFKKVDGIIQQLPETYFTQSRDLEDF 

KPRSQMETDFLELAMDEFIQRYKLEGYAFEHIVYGDFSHGQLGGLHLMIG 

DAKRSQDSPLKLEDPIPlMDSTVKNYFITDAQTGSSKCVCSVIDLriliDDFV 

EI IKSQDLSVI SKWKVTIDYAEI SFMLWCKDGHVETFYPKLQAS QAWQP 

GVAMPNLYKMQRMLLEKCDLQNYGENAVIPKGIMyiNVAKYTQLCQYLNTL 

TLAVPYNMRVIHFGAGSDKGVAPGTAVIiRQWLPTGTLLVDSDLNDFVSDA 

DSTLIGDCATVHTANKWDLIISDMYDPRTKHVTKENDSKEGFFTYLCGFI 

KQKL ALGG S I AVK I TEH S WNADL YKLMGHF S WWT AF VTNVNAS S S E AFL I 

GANYLGKPKEQIDGYTMHANYIFWRNTNPIQLSSYSLFDMSKFPLKLRGT 

AVMSLKENQINDMIYSLLEKGRIillRENNRVWSSDILVlSIN 

(SEQ ID NO: 65) 



FIGURE 17

MDLFMRFFTLRSITAQPVKIDNASPASTVHATATIPLQASLPFGWLVIGV
AFLAVFQSATKIIALNKRWQLALYKGPQFICNLLLLFVTIYSHLLLVAAG
MEAQFLYLYALIYFLQCINACRIIMRCWLCWKCKSKNPLLYDANYFVCWH
THNYDYCIPYNSVTDTIVVTEGDGISTPKLKEDYQIGGYSEDRHSGVKDY
VVVHGYFTEVYYQLESTQITTDTGIENATFFIFNKLVKDPPNVQIHTIDG
SSGVANPAMDPIYDEPTTTTSVPL (SEQ ID NO: 66)

FIGURE 18 



MMPTTLFAGTHITMTTVYHITVSQIQLSLLKVTAFQHQNSKKTTKLVVIL
RIGTQVLKTMSLYMAISPKFTTSLSLHKLLQTLVLKMLHSSSLTSLLKTH 
RMCKYTQSTALQELLIQQWIQFMMSRRRLLACLCKHKKVSTNLCTHSFRK 
KQVR (SEQ ID NO: 67) 

FIGURE 19 



MFHLVDFQVTIAEILIIIMRTFRIAIWNLDVIISSIVRQLFKPLTKKNYS
ELDDEEPMELDYP (SEQ ID NO: 68)

FIGURE 20 



MKIILFLTLIVFTSCELYHYQECVRGTTVLLKEPCPSGTYEGNSPFHPLA
DNKFALTCTSTHFAFACADGTRHTYQLRARSVSPKLFIRQEEVQQELYSP
LFLIVAALVFLILCFTIKRKTE (SEQ ID NO: 69)

FIGURE 21 



MNELTLIDFYLCFLAFLLFLVLIMLIIFWFSLEIQDLEEPCTKV

(SEQ ID NO: 70) 

FIGURE 22 



MKLLIVLTCISLCSCICTVVQRCASNKPHVLEDPCKVQH
(SEQ ID NO: 71) 

FIGURE 23 



MCLKILVRYNTRGNTYSTAWLCALGKVLPFHRWHTMVQTCTPNVTINCQD 
PAGGALIARCWYLHEGHQTAAFRDVLVVLNKRTN (SEQ ID NO: 72)

FIGURE 24



MDPNQTNVVPPALHLVDPQIQLTITRMEDAMGQGQNSADPKVYPIILRLG
SQLSLSMARRNLDSLEARAFQSTPIVVQMTKLATTEELPDEFVVVTAK

(SEQ ID NO: 73) 

FIGURE 25 



MLPPCYNFLKEQHCQKASTQREAEAAVKPLLAPHHVVAVIQEIQLLAAVG
EILLLEWLAEVVKLPSRYCC (SEQ ID NO: 74)



FIGURE 26 

CIAVGQLCVFWNIGRPCCSGLCVFA--CTVKL conotoxin
CISLCS-CICTVVQRCASNKPHVLEDPCKVQH sars

•k -k , m ^ *. ;*«•• *; k ^ it » 



FIGURE 27

SEQUENCE LISTING

<110> BC CANCER AGENCY

<120> SARS VIRUS NUCLEOTIDE AND AMINO ACID SEQUENCES AND USES THEREOF
<130> 82936-7
<160> 206

<170> PatentIn version 3.3

<210> 1

<211> 29736

<212> DNA

<213> Severe acute respiratory syndrome virus



<400> 1

ctacccagga aaagccaacc aacctcgatc tcttgtagat ctgttctcta aacgaacttt       60
aaaatctgtg tagctgtcgc tcggctgcat gcctagtgca cctacgcagt ataaacaata      120
ataaatttta ctgtcgttga caagaaacga gtaactcgtc cctcttctgc agactgctta      180
cggtttcgtc cgtgttgcag tcgatcatca gcatacctag guT-i.cgTi,ccg ggngiigaccg      240
aaaggtaaga tggagagcct tgttcttggt gtcaacgaga aoLcLcacacg^ ccaacucagTi      300
ttgcctgtcc ttcaggttag agacgtgcta gtgcgtggct L.cggggacuc tyuggaagag      360
gccctatcgg aggcacgtga acacctcaaa aatggcactt ycgytctay ciyctyGuygcio-      420
aaaggcgtac tgccccagct tgaacagccc tatgtgttca "h3a^5r»rT+"o      480
agcaccaatc acggccacaa ggtcgttgag ctggttgcag aaatcrerar«cfcr ca11cacf+"J3.c      540
ggtcgtagcg gtataacact gggagtactc gtgccacatg taaccGcsaaao cccaattaGa      600
taccgcaatg ttcttcttcg taagaacggt aataagggag ccggtggtca tagctatggc      660
atcgatctaa agtcttatga cttaggtgac gagcttggca ctgatcccat tgaagattat      720
gaacaaaact ggaacactaa gcatggcagt ggtgcactcc gtgaactcac tcgtgagctc      780
aatggaggtg cagtcactcg ctatgtcgac aacaatttct gtggcccaga tgggtaccct      840
cttgattgca tcaaagattt tctcgcacgc gcgggcaagt caatgtgcac tctttccgaa      900
caacttgatt acatcgagtc gaagagaggt gtctactgct gccgtgacca tgagcatgaa      960
attgcctggt tcactgagcg ctctgataag agctacgagc accagacacc cttcgaaatt     1020
aagagtgcca agaaatttga cactttcaaa ggggaatgcc caaagtttgt gtttcctctt     1080
aactcaaaag tcaaagtcat tcaaccacgt gttgaaaaga aaaagactga gggtttcatg     1140
gggc^tatac gctctgtgta ccctgttgca tctccacagg agtgtaacaa tatgcacttg     1200
tctaccttga tgaaatgtaa tcattgcgat gaagtttcat ggcagacgtg cgactttctg     1260
aaagccactt gtgaacattg tggcactgaa aatttagtta ttgaaggacc tactacatgt     1320
gggtacctac ctactaatgc tgtagtgaaa atgccatgtc ctgcctgtca agacccagag     1380
attggacctg agcatagtgt tgcagattat cacaaccact caaacattga aactcgactc     1440


cgcaagggag gtaggactag atgttttgga ggctgtgtgt ttgcctatgt tggctgctat     1500
aataagcgtg cctactgggt tcctcgtgct agtgctgata ttggctcagg ccatactggc     1560
attactggtg acaatgtgga gaccttgaat gaggatctcc ttgagatact gagtcgtcraa     1620
cgtgttaaca ttaacattgt tggcgatttt catttgaatg aagaggttgc catcattttg     1680
gcatctttct ctgcttctac aagtgccttt attgacacta taaagagtct tgattacaag     1740
tctttcaaaa ccattgttga gtcctgcggt aactataaag ttaccaaggg aaagcccgta     1800
aaaggtgctt ggaacattgg acaacagaga tcagttttaa caccactgtg tggttttccc     1860
tcacaggctg ctggtgttat cagatcaatf tttgcgcgca cacttgatgc agcaaaccac     1920
tcaattcctg atttgcaaag agcagctgtc accatacttg atggtatttc tgaacagtca     1980
ttacgtcttg tcgacgccat ggtttatact tcagacctgc tcaccaacag tgtcattatt     2040
atggcatatg taactggtgg tcttgtacaa cagacttctc agtggttcjtc taatcttttg     2100
ggcactactg ttgaaaaact caggcctatc tttgaatgga ttgaggcgaa acttagtgca     2160
ggagttgaat ttctcaagga tgcttgggag attctcaaat ttctcattac     2220
gacatcgtca agggtcaaat acaggttgct tcagataaca tcaaaaatta tQtaaaatcrc     2280
ttcattgatg ttgttaacaa ggcactcgaa atgtgcattg atcaagtcac tatcgctggc     2340
gcaaagttgc gatcactcaa cttaggtgaa gtcttcatcg ctcaaagcaa gggactttac     2400
cgtcagtgta tacgtggcaa ggagcagctg caactactca tgcctcttaa ggcaccaaaa     2460
gaagtaacct ttcttgaagg tgattcacat gacacagtac ttacctctga ggaqgttqtt     2520
ctcaagaacg gtgaactcga agcactcgag acgcccgttg atagcttcac aaatggagct     2580
atcgtcggca caccagtctg tgtaaatggc ctcatgctct tagagattaa ggacaaagaa     2640
caatactgcg cattgtctcc tggtttactg gctacaaaca atgtctttcg cttaaaaggg     2700
ggtgcaccaa ttaaaggtgt aacctttgga gaagatactg tttqggaagt tcaaggttac     2760
aagaatgtga gaatcacatt tgagcttgat gaacgtgttg acaaagtgct taatgaaaag     2820
tgctctgtct acactgttga atccggtacc gaagttactg agtttgcatg tgttgtagca     2880
gaggctgttg tgaagacttt acaaccagtt yygi-auugau     2940
cttgatgagt ggagtgtagc tacattctac ttatttgatg atgctggtga agaaaacttt     3000
tcatcacgta tgtattgttc cttttaccct ccagatgagg aagaagagga cgatgcagag     3060
tgtgaggaag aagaaattga tgaaacctgt gaacatgagt acggtacaga ggatgattat     3120
caaggtctcc ctctggaatt tggtgcctca gctgaaacag ttcgagttga ggaagaagaa     3180
gaggaagact ggctggatga tactactgag caatcagaga ttgagccaga accagaacct     3240


acacctgaag aaccagttaa tcagtttact ggftatttaa aacttactga caatgttgcc     3300
attaaatgtg ttgacatcgt taaggaggca caaagtgcta atcctatggt gattgtaaat     3360
gctgctaaca tacacctgaa acatggtggt ggtgtagcag gtgcactcaa caaggcaacc     3420
aatggtgcca tgcaaaagga gagtgatgat tacattaagc taaatggccc tcttacagta     3480
ggagggtctt gtttgctttc tggacataat cttgctaaga agtgtctgca tgttgttgga     3540
cctaacctaa atgcaggtga ggacatccag cttcttaagg cagcatatga aaatttcaat     3600
tcacaggaca tcttacttgc accattgttg tcagcaggca tatttggtgc taaaccactt     3660
cagtctttac aagtgtgcgt gcagacggtt cgtacacagg tttatattgc agtcaatgac     3720
aaagctcttt atgagcaggt tgtcatggat tatcttgata acctgaagcc tagagtggaa     3780
gcacctaaac aagaggagcc accaaacaca gaagattcca aaactgagga gaaatctgtc     3840
gtacagaagc ctgtcgatgt gaagccaaaa attaaggcct gcattgatga ggttaccaca     3900
acactggaag aaactaagtt tcttaccaat aagttactct tgtttgctga tatcaatggt     3960
aagctttacc atgattctca gaacatgctt agaggtgaag atatgtcttt ccttgagaag     4020
gatgcacctt acatggtagg tgatgttatc actagtggtg atatcacttg tgttgtaata     4080
ccctccaaaa aggctggtgg cactactgag atgctctcaa gagctttgaa gaaagtgcca     4140
gttgatgagt atataaccac gtaccctgga caaggatgtg ctggttatac acttgaggaa     4200
gctaagactg ctcttaagaa atgcaaatct gcattttatg tactaccticc agaagcacct     4260
^atgctaagg aagagattct aggaactgta tcctggaatt tgagagaaat gcttgctcat     4320
gctgaagaga caagaaaatt aatgcctata tgcatggatg ttagagccat aatggcaacc     4380
atccaacgta agtataaagg aattaaaatt caagagggca tcgttgacta tggtgtccga     4440
ttcttctttt atactagtaa agagcctgta gcttctatta ttacgaagct gaactctcta     4500
aatgagccgc ttgtcacaat gccaattggt tatgtgacac atggttttaa tcttgaagag     4560
gctgcgcgct gtatgcgttc tcttaaagct cctgccgtag tgtcagtatc atcaccagat     4620
gctgttacta catataatgg atacctcact tcgtcatcaa agacatctga ggagcacttt     4680
gtagaaacag tttctttggc tggctcttac agagattggt cctattcagg acagcgtaca     4740
gagttaggtg ttgaatttct taagcgtggt gacaaaattg tgtaccacac tctggagagc     4800
cccgtcgagt ttcatcttga cggtgaggtt ctttcacttg acaaactaaa gagtctctta     4860
tccctgcggg aggttaagac tataaaagtg ttcacaactg tggacaacac taatctccac     4920
acacagcttg tggatatgtc tatgacatat ggacagcagt ttggtccaac atacttggat     4980
ggtgctgatg ttacaaaaat taaacctcat gtaaatcatg agggtaaigac tttctttgta     5040
ctacctagtg atgacacact acgtagtgaa gctttcgagt actaccatac tcttgatgag     5100


agttttcttg gtaggtacat gtctgcttta aaccacacaa agaaatggaa atttcctcaa     5160
gttggtggtt taacttcaat baaatgggct gataacaatt gttatttgtc tagtgtttta     5220
ttagcacttc aacagcttga agtcaaattc aatgcaccag cacttcaaga ggcttattat     5280
agagcccgtg ctggtgatgc tgctaacttt tgtgcactca tactcgctta cagtaataaa     5340
actgttggcg agcttggtga tgtcagagaa actatgaccc atcttctaca gcatgctaat     5400
ttggaatctg caaagcgagt tcttaatgtg gtgtgtaaac attgtggtca gaaaactact     5460
accttaacgg gtgtagaagc tgtgatgtat atgggtactc tatcttatga taatcttaag     5520
acaggtgttt ccattccatg tgtgtgtggt cgtgatgcta cacaatatct agtacaacaa     5580
gagtcttctt ttgttatgat gtctgcacca cctgctgagt ataaattaca gcaaggtaca     5640
ttcttatgtg cgaatgagta cactggtaac tatcagtgtg gtcattacac tcatataact     5700
gctaaggaga ccctctatcg tattgacgga gctcacctta caaagatgtc agagtacaaa     5760
ggaccagtga ctgatgtttt ctacaaggaa acatcttaca ctacaaccat caagcctgtg     5820
tcgtataaac tcgatggagt tacttacaca gagattgaac caaaattgga tgggtattat     5880
aaaaaggata atgcttacta tacagagcag cctatagacc ttgtaGcaac tcaaccatta     5940
ccaaatgcga gttttgataa tttcaaactc acatgttcta acacaaaatt tgctgatgat     6000
ttaaatcaaa tgacaggctt cacaaagcca gcttcacgag agctatctgt cacattcttc     6060
ccagacttga atggcgatgt agtggctatt gactatagac actattcagc gagtttcaag     6120
aaaggtgcta aattactgca taagccaatt gtttggcaca ttaaccaggc tacaaccaag     6180
acaacgttca aaccaaacac ttggtgttta cgttgtcttt ggagtacaaa gccagtagat     6240
acttcaaatt catttgaagt tctggcagta gaagacacac aaggaatgga eaatcttgct     6300
tgtgaaagtc aacaacccac ctctgaagaa gtagtggaaa atcctaccat acagaaggaa     6360
gtcatagagt gtgacgtgaa aactaccgaa gttgtaggca atgtcatact taaaccatca     6420
gatgaaggtg ttaaagtaac acaagagtta ggtcatgagg atcttatggc tgcttatgtg     6480
gaaaacacaa gcattaccat taagaaacct aatgagcttt cactagcctt aggtttaaaa     6540
acaattgcca ctcatggtat tgctgcaatt aatagtgttc cttcraarrf"?ia     6600
tatgtcaaac cattcttagg acaagcagca attacaacat caaattgcgc taagagatta     6660
gcacaacgtg tgtttaacaa ttatatgcct tatgtgttta cattattgtt ccaattgtgt     6720
acttttacta aaagtaccaa ttctagaatt agagcttcac tacctacaac tattgctaaa     6780
aatagtgtta agagtgttgc taaattatgt ttggatgccg gcattaatta tgtgaagtca     6840


cccaaatttt ctaaattgtt cacaatcgct atgtggctat tgttgttaag tatttgctta     6900
ggttctctaa tctgtgtaac tgctgctttt ggtgtactct tatctaattt tggtgctcct     6960
tcttattgta atggcgttag agaattgtat cttaattcgt ctaacgttac tactatggat     7020
ttctgtgaag gttcttttcc ttgcagcatt tgtttaagtg gattagactc ccttgattct     7080
tatccagctc ttgaaaccat tcaggtgacg atttcatcgt acaagctaga cttgacaatt     7140
ttaggtctgg ccgctgagtg ggttttggca tatatgttgt tcacaaaatt cttttattta     7200
ttaggtcttt cagctataat gcaggtgttc tttggctatt ttgctagtca tttcatcagc     7260
aattcttggc tcatgtggtt tatcattagt attgtacaaa tggcacccgt ttctgcaatg     7320
gttaggatgt acatcttctt tgcttctttc tactacatat ggaagagcta tgttcatatc     7380
atggatggtt gcacctcttc gacttgcatg atgtgctata agcgcaatcg tgccacacgc     7440
gttgagtgta caactattgt taatggcatg aagagatctt tctatgtcta tgcaaatgga     7500
ggccgtggct tctgcaagac tcacaattgg aattgtctea attgtgacac attttgcact     7560
ggtagtacat tcattagtga tgaagttgqt cgtgatttgt cactccagtt taaaagacca     7620
atcaacccta ctgaccagtc atcgtatatt gttgatagtg ttgctgtgaa aaatggcgcg     7680
cttcacctct actttgacaa ggctggtcaa aagacctatg agagacatcc gctctcccat     7740
tttgtcaatt tagacaattt gagagctaac aacactaaag gttcactgcc tattaatgtc     7800
atagtttttg atggcaagtc caaatgcgac gagtctgctt ctaagtctgc ttctgtgtac     7860
tacagtcagc tgatgtgcca acctattctg ttgcttgacc aagctcttgt atcagacgtt     7920
ggagatagta ctgaagtttc cgttaagatg tttgatgctt atgtcgacac cttttcagca     7980
acttttagtg ttcctatgga aaaacttaag gcacttgttg ctacagctca cagcgagtta     8040
gcaaagggtg tagctttaga tggtgtcctt tctacattcg tgtcagctgc ccgacaaggt     8100
gttgttgata ccgatgttga cacaaaggat gttattgaat gtctcaaact ttcacatcac     8160
tctgacttag aagtgacagg tgacagttgt aacaatttca tgctcaccta taataaggtt     8220
gaaaacatga Ggcccagaga tcttggcgca tgtattgact gtaatgcaag gcatatcaat     8280
gcccaagtag caaaaagtca caatgtttca ctcatctgga atgtaaaaga ctacatgtct     8340
ttatctgaac agctgcgtaa acaaattcgt agtgctgcca agaagaacaa catacctttt     8400
agactaactt gtgctacaac tagacaggtt gtcaatgtca taactactaa aatctcactc     8460
aagggtggta agattgttag tacttgtttt aaacttatgc ttaaggccac attattgtgc     8520
gttcttgctg cattggtttg ttatatcgtt atgccagtac atacattgtc aatccatgat     8580
ggttacacaa atgaaatcat tggttacaaa gccattcagg atggtgtcac tcgtgacatc     8640
atttctactg atgattgttt tgcaaataaa catgctggtt ttgacgcatg gtttagccag     8700
cgtggtggtt catacaaaaa tgacaaaagc tgccctgtag tagctgctat cattacaaga     8760


gagattggtt tcatagtgcc tggcttaccg ggtactgtgc tgagagcaat caatggtgac     8820
ttcttgcatt ttctacctcg tgtttttagt gctgttggca acatttgcta cacaccttcc     8880
aaactcattg agtatagtga ttttgctacc tctgcttgcg ttcttgctgc tgagtgtaca     8940
atttttaagg atgctatggg caaacctgtg ccatattgtt atgacactaa tttgctagag     9000
ggttctattt cttatagtga gcttcgtcca gacactcgtt atgtgcttat ggatggttcc     9060
atcatacagt ttcctaacac ttacctggag ggttctgtta gagtagtaac aacttttgat     9120
gctgagtact gtagacatgg tacatgcgaa aggtcagaag taggtatttg cctatctacc     9180
agtggtagat gggttcttaa taatgagcat tacagagctc tatcaggagt tttctgtggt     9240
gttgatgcga tgaatctcat agctaacatc tttactcctc ttgtgcaacc tgtgggtgct     9300
ttagatgtgt ctgcttcagt agtggctggt ggtattattg ccatattggt gacttgtgtrt     9360
gcctactact ttatgaaatt cagacgtgtt tttggtgagt acaaccatgt tgttgctgct     9420
aatgcacttt tgtttttgat gtctttcact atactctgtc tggtaccagc ttacagcttt     9480
ctgccgggag tctactcagt cttttacttg tacttgacat tctatttcac caatgatgtt     9540
tcattcttgg ctcaccttca atggtttgcc atgttttctc ctattgtgcc tttttggata     9600
acagcaatct atgtattctg tatttctctg aagcactgcc attggttctt taacaactat     9660
cttaggaaaa gagtcatgtt taatggagtt acatttagta ccttcgagga ggctgctttg     9720
tgtacctttt tgctcaacaa ggaaatgtac ctaaaattgc gtagcgagac actgttgcca     9780
cttacacagt ataacaggta tcttgctcta tataacaagt acaagtattt cagtggagcc     9840
ttagatacta ccagctatcg tgaagcagct tgctgccact tagcaaaggc tctaaatgac     9900
tttagcaact caggtgctga tgttctctac caaccaccac agacatcaat cacttctgct     9960
gttctgcaga gtggttttag gaaaatggca ttcccgtcag gcaaagttga sgggtgcatg    10020
gtacaagtaa cctgtggaac tacaactctt aatggattgt ggttggatga cacagtatac    10080
tgtccaagac atgtcatttg cacagcagaa gacatgctta atcctaacta tgaagatctg    10140
ctcattcgpa aatccaacca tagctttctt gttcaggctg gcaatgttca acttcgtgtt    10200
attggccatt ctatgcaaaa ttgtctgctt aggcttaaag ttgatacttc taaccctaacf    10260
acacccaagt ataaatttgt ccgtatccaa cctggtcaaa cattttcagt tctagcatgc    10320
tacaatggtt caccatctgg tgtttatcag tgtgccatga gacctaatca taccattaaa    10380
ggttctttcc ttaatggatc atgtggtagt gttggtttta acattgatta tgattgcgtg    10440






WO 2004/096842 PCT/CA2004/000626



tctttctgct 


atatgcatpa 


tatggagctt 


ccaacaggag 


tacacgctgg 


tactgactta 


10500 


gaaggtaaat 


tctatggtcc 


atttgttgac 


agacaaactg 


cacaggctgc 


aggtaoagac 


10560 


acaaccataa 


cattaaatgt tttggcatgg 


ctgtatgctg 


ctgttatcaa 


tggtgatagg 


10620 


tggtttctta 


atagattcac 


cactactttg 


^atgacttta 


accttgtggc 


aatgaagtac 


10680 


aactatgaac 


ctttgacaca 


agatcatgtt 


gacatattgg 


gacctctttc 


tgctcaaaca 


10740 


ggaattgccg 


tcttagatat 


gtgtgctgct 


ttgaaagagc 


tgctgcagaa 


tggtatgaat 


10800 


ggtcgtacta 


tccttggtag 


cactatttta 


gaagatgagt 


.ttacaccatt 


tgatgttgtt 


10860 


agacaatgct 


ctggtgttac cttccaaggt 


aagt.t.caaga 


aaattgttaa 


gggcactcat 


10920 


cattggatgc 


ttttaacttt 


cttgacatca 


ctattgattc 


ttgttcaaag 


tacacagtgg 


10980 


tcactgtttt 


tctttgttta 


cgagaatgct 


ttcttgccat 


ttactcttgg 


tattatggca 


11040 


attgctgcat 


gtgctatgct 


gcttgttaag 


cataagcacg 


cattcttgtg 


cttgtttctg 


11100 


ttaccttctc 


ttgcaacagt


tgcttacttt 


aatatggtct 


acatgcctgc 


tagctgggtg 


11160 


atgcgtatca 


tgacatggct 


tgaattggct 


gacactagct 


tgtctggtta 


taggcttaag 


11220 


gattgtgtta 


tgtatgcttc 


agctttagtt 


ttgcttattc 


tcatgacagc 


tcgcactgtt


11280 


tatgatgatg 


ctgctagacg 


tgtttggaca 


ctgatgaatg 


tcattacact 


tgtttacaaa 


11340 


gtctactatg 


gtaatgcttt 


agatcaagct 


atttccatgt 


gggccttagt 


tatttctgta 


11400 


acctctaact 


attctggtgt 


cgttacgact 


atcatgtttt 


tagctagagc 


tatagtgttt 


11460 


gtgtgtgttg 


agtattaccc 


attgttattt 


attactggca 


acaccttaca 


gtgtatcatg 


11520 


cttgtttatt 


gtttcttagg 


ctattgttgc


-tgctgctacb 


"tt-ggccttt-t 


ctgtttactc 


11580 


aaccgttact 


tcaggcttac 


tcttggtgtt 


tatgactact 


tggtctctac 


acaagaattt 


11640 


aggtatatga 


actcccaggg 


gcttttgcct 


cctaagagta 


gtattgatgc 


tttcaagctt 


11700 


aacattaagt 


tgttgggtat 


tggaggtaaa 


ccatgtatca 


aggttgctac 


tgtacagtct 


11760 


aaaatgtctg 


acgtaaagtg 


cacatctgtg 


gtactgctct 


cggttcttca 


acaacttaga 


11820 


gtagagtcat 


cttctaaatt 


gtgggcacaa 


tgtgtacaac


tccacaatga 


tattcttctt 


11880 


gcaaaagaca 


caactgaagc 


tttcgagaag 


atggtttctc 


ttttgtctgt 


tttgctatcc 


11940 


atgcagggtg 


ctgtagacat 


taataggttg 


tgcgaggaaa 


tgctcgataa 


ccgtgctact 


12000


cttcaggcta 


ttgcttcaga 


atttagttct 


ttaccatcat 


atgccgctta 


tgccactgcc 


12060 


caggaggcct 


atgagcaggc 


tgtagctaat 


ggtgattctg 


aagtcgttct 


caaaaagtta 


12120 


aagaaatctt 


tgaatgtggc 


taaatctgag 


tttgaccgtg 


atgctgccat 


gcaacgcaag 


12180 


ttggaaaaga 


tggcagatca 


ggctatgacc 


caaatgtaca 


aacaggcaag 


atctgaggac 


12240 


aagagggcaa 


aagtaactag 


tgctatgcaa 


acaatgctct 


tcactatgct 


taggaagctt 


12300 
gataatgatg


cacttaacaa 


cattatcaac 


aatgcgcgtg 


atggttgtgt 


tccactcaac 


12360 


atcataccat 


tgactacagc 


agccaaactc 


atggttgttg 


tcGctgatta 


tggtacctac 


12420


aagaacactt 


gtgatggtaa 


cacctttaca 


tatgcatctg


cactctggga 


aatccagcaa 


12480 


gttgttgatg 


cggatagcaa 


gattgttcaa 


cttagtgaaa 


ttaacatgga 


caattcacca 


12540 


aatttggctt 


ggcctcttat 


tgttacagct 


ctaagagcca 


actcagctgt 


taaactacag 


12600 


aataatgaac 


tgagtccagt 


agcactacga 


cagatgtcct


gtgcggctgg 


taccacacaa 


12660 


aqagcttgta 


ctgatgacaa 


tgcacttgcc 


tactataaca 


attcgaaiggg 


aggtaggttt 


12720 


gtgctggcat 


tactatcaga 


Gcaccaagat 


ctcaaatggg 


ctagattccc 


taagagtgat 


12780 


ggtacaggta 


caatttacac 


agaactggaa 


ccaccttgta 


ggtttgttac 


agacacacca 


12840 


aaagggccta 


aagtgaaata 


cttgtacttc 


atcaaaggct 


taaacaacct 


aaatagaggt 


12900 


atggtgctgg 


gcagtttagc 


tgctacagta 


cgtcttcagg 


ctggaaatgc 


tacagaagta 


12960 


cctgccaatt 


caactgtgct 


ttccttctgt 


gcttttgcag 


tagaccctgc 


taaagcatat 


13020 


aaggattacc 


tagcaagtgg 


aggacaacca 


atcaccaact 


gtgtgaagat 


gttgtgtaca 


13080 


cacactggta 


caggacaggc 


aattactgta 


acaccagaag 


ctaacatgga 


ccaagagtcc 


13140 



tttggtggtg 


cttcatgttg 


tctgtattgt 


agatgccaca 


ttgaccatcc 


aaatcctaaa


13200


ggattctgtg


acttgaaagg 


taagtacgtc 


caaataccta 


ccacttgtgc 


taatgaccca 


13260 


gtgggtttta


cacttagaaa 


cacagtctgt 


accgtctgcg 


gaatgtggaa


aggttatggc


13320


tgtagttgtg 


accaactccg 


cgaacccttg 


atgcagtctg 


cggatgcatc 


aacgttttta 


13380 


aacgggtttg 


cggtgtaagt 


gcagcccgtc 


ttacaccgtg 


cggcacaggc 


actagtactg 


13440 


atgtcgtcta 


cagggctttt 


gatatttaca 


acgaaaaagt 


tgctggtttt 


gcaaagttcc 


13500 


taaaaactaa 


ttgctgtcgc 


ttccaggaga 


aggatgagga 


aggcaattta 


ttagactctt 


13560 


actttgtagt 


taagaggcat 


actatgtcta 


actaccaaca 


tgaagagact 


atttataact 


13620 


tggttaaaga 


ttgtccagcg 


gttgctgtcc


atgacttttt 


caagtttaga 


gtagatggtg 


13680 


acatggtacc 


acatatatca 


cgtcagcgtc 


taactaaata 


cacaatggct 


gatttagtct 


13740 


atgctctacg 


tcattttgat 


gagggtaatt 


gtgatacatt 


aaaagaaata 


ctcgtcacat 


13800 


acaattgctg 


tgatgatgat 


tatttcaata 


agaaggattg 


gtatgacttc 


gtagagaatc 


13860 


ctgacatctt 


acgcgtatat 


gctaacttag 


atoacrcatat 


acaccaatca

Vig ^^uL ^ V** 
ctgtacaatt


ctgcgatgct 


atgcgtgatg 


caggcattgt 


aggcgtactg 


acattagata 


13980 


atcaggatct 


taatgggaac 


tggtacgatt 


tcggtgattt 


cgtacaagta 


gcaccaggct 


14040 


gcggagttcc 


tattgtggat 


tcatattact 


cattgctgat 


gcccatcctc 


actttgacta 


14100 
gggcattggc


tgctgagtcc 


catatggatg 


atttgctgaa 


atatgatttt 


acggaagaga 


attgggacca 


gacataccat 


cccaattgta 


attgtgcaaa 


ctttaatgtg 


ttattttcta 


tagtaagaaa 


aatatttgta 


gatggtgttc 


gtgagttagg 


agtcgtacat 


aatcaggatg 


aggaactttt 


agtgtatgct 


gctgatccag 


tagataaacg 


cactacatgc 


ttttcagtag 


ctgtcaaacc 


cggtaatttt 


aataaagact 


ttaaggaagg 


aagttctgtt 


gaactaaaac


ctatcagtga 


ttatgactat 


tatcgttata 


tcctattcgt 


agttgaagtt 


gttgataaat 


atgccaacca 


agtaatcgtt 


aacaatctgg 


ggggtaaggc 


tagactttat 


tatgactcaa 


cgtatactaa 


gcgtaatgtc 


atccctacta 


gtgcaaagaa 


tagagctcgc


accgtagctg 


gacagtttca 


tcagaaatta 


ttgaagtcaa 


ttggaacaag 


caagttttac 


ggtggctggc 


tagaaactcc 


acaccttatg 


ggttgggatt 


tgcttaggat 


aatggcctct 


cttgttcttg 


cacaccgttt 


ctacaggtta 


gctaacgagt 


gtggcggctc 


actatatgtt 


aaaccaggtg 


atgctaatag 


tgtctttaac 


atttgtcaag 


caactgatgg 


taataagata 


gctgacaagt 


agtgtctcta 


tagaaatagg 


gatgttgatc 


tgcgtaaaca 


tttctccatg 


atgattcttt 


actatgcggc 


tcaaggttta 


gtagctagca 


aaaataatgt 


gttcatgtct 


gaggcaaaat 


ctcacgaatt 


ttgctcacag 


catacaatgc 


tgccttaccc 


agatccatca 


agaatattag 


aaacagatgg 


tacacttatg 


attgaaaggt 
ctgatctcgc


aaaaccactt 


attaagtggg 


14160 


gactttgtct 


cttcgaccgt 


tattttaaat 


14220 


ttaactgttt 


ggatgatagg 


Tigtatccttc 


14280 


ctgtgtttcc 


acctacaagt 


tttggaccac 


14340 


cttttgttgt ttcaactgga 


taccattttc 


14400 


taaacttaca 


tagctcgcgt 


ctcagtttca 


14460 


ctatgcatgc 


agcttctggc


aatttattgc 


14520 


ctgcactaac 


aaacaatgtt 


gcttttcaaa


14580 


tttatgactt 


tgctgtgtct 


aaaggtttct 


14640 


acttcttctt 


tgctcaggat 


ggcaacgctg 


14700 


atctgccaac aatgtgtgat 


atcagacaac 


14760 


actttgattg 


ttacgatggt 


ggctgtatta 


14820 


ataaatcagc 


tggtttccca 


tttaataaat 


14880 


tgagttatga 


ggatcaagat 


gcacttttcg


14940 


taactcaaat 


gaatcttaag 


tatgccatta 


15000 


gtgtctctat 


ctgtagtact 


atgacaaata 


15060 


tagccgccac 


tagaggagct 


actgtggtaa 


15120 


ataatatgtt 


aaaaactgtt 


tacagtgatg 


15180 


atccaaaatg* -tgaGagagcc 


atgcctaaca 


15240 


ctcgcaaaca 


taacacttgc 


tgtaacttat 


15300 


gtgcgcaagt 


attaagtgag 


atggtcatgt 


15360 


gaacatcatc 


cggtgatgct 


acaactgctt 


15420 


ctgttacagc 


caatgtaaat 


gcacttcttt 


15480 


atgtccgcaa 


tctacaacac 


aggctctatg 


15540 


atgaattcgt 


ggatgagttt 


tacgcttacc


15600 


ctgatgatgc 


cgttgtgtgc 


na uaacagta 


15660


ttaagaactt 


taaggcagtt 


ctttat tatc 


15720


gttggactga 


gactgacctt 


actaaaggac 


15780


tagttaaaca 


aggagatgat 


tacgtgtacc 


15840 


gcgcaggctg 


ttttgtcgat 


gatattgtca 


15900 


tcgtgtcact 


ggctattgat 


gcttacccac 


15960 
ttacaaaaca


tcctaatcag 


gagtatgctg 


atgtctttca 


cttgtattta 


caatacatta 


16020 


gaaagttaca 


tgatgagctt 


actggccaca 


tgttggacat


gtattccgta


atgctaaeta 


16080 


atgataacac 


ctcacggtac 


tgggaacctg 


agttttatga 


ggctatgtac 


acaccacata 


16140 


cagtcttgca 


ggctgtaggt 


gcttgtgtat 


tgtgcaattc 


acagacttca 


cttcgttgcg 


16200 


gtgcctgtat 


taggagacca 


ttcctatgtt 


gcaagtgctg 


ctatgaccat 


gtcatttcaa 


16260 


catcacacaa 


attagtgttg 


tctgttaatc


cctatgtttg 


caatgcccca 


gattataatQ 


16320


tcactgatgt 


gacacaactg 


tatctaggag 


gtatgagcta 


ttattgcaag 


"bcacafcaagc 


16380 


ctcccattag 


ttttccatta 


tgtgctaatg 


gtcaggtttt 


tggtttatac 


aaaaacacat 


16440 


gtgtaggcag 


tgacaatgtc 


actgacttca 


atgcgatagc 


aacatgtgat 


tggactaatg


16500 


ctggcgatta 


catacttgcc 


aacacttgta 


ctgagagact 


caagcttttc


gcagcagaaa 


16560 


cgctcaaagc 


cactgaggaa 


acatttaagc 


tgtcatatgg 


tattgccact 


gtacgcgaag 


16620 


tactctctga 


cagagaattg 


catctttcat 


gggaggttgg 


aaaacctaga 


ccaccattga 


16680 


acagaaacta 


tgtctttact 


ggttaccgtg 


taactaaaaa


tagtaaagta 


cagattggag


16740 


agtacacctt 


tgaaaaaggt 


gactatggtg 


atgctgttgt 


gtacagaggt 


actacgacat


16800 



acaagttgaa 


tgttggtgat 


tactttgtgt 


tgacatctca 


cactgtaatg 


ccacttagtg


16860 


cacctactct 


agtgccacaa 


gagcactatg 


tgagaattac 


tggcttgtac 


ccaacactca 


16920


acatctcaga 


tgagttttct 


agcaatgttg


caaattatcs 


aaaggtcggc 


atgcaaaagt 


16980 


actctacact 


ccaaggacca 


cctggtactg 


gtaagagtca 


ttttgccatc 


ggacttgctc 


17040


tctattaccc 


atctgctcgc 


atagtgtata 


cggcatgctc 


tcatgcagct 


gttgatgccc 


17100 


tatgtgaaaa 


ggcattaaaa 


tatttgccca 


tagataaatg 


tagtagaatc 


atacctgcgc 


17160 


gtgcgcgcgt 


agagtgtttt 


gataaattca 


aagtgaattc 


aacactagaa 


cagtatgttt 


17220 


tctgcactgt 


aaatgcattg 


ccagaaacaa


ctgctgacat 


tgtagtcttt 


gatgaaatct 


17280 


ctatggctac 


taattatgac 


ttgagtgttg


tcaatgctag 


acttcgtgca 


aaacactacg 


17340 


tctatattgg 


cgatcctgct 


caattaccag 


ccccccgcac 


attgctgact 


aaaggcacac 


17400 


tagaaccaga 


atattttaat 


tcagtgtgca 


gacttatgaa 


aacaataggt 


ccagacatgt 


17460 


tccttggaac 


ttgtcgccgt 


tgtcctgctg


aaattgttga 


cactgtgagt 


gctttagttt 


17520 


atcracaataa 






ra f^st <"» 4* /*• a 

cigi. ^ogcjuca 


a ugcLiicaaa 


atgttctaca 


17580


aaggtgttat 


tacacatgat 


gtttcatctg 


caatcaacag 


acctcaaata 


ggcgttgtaa 


17640 


gagaatttct 


tacacgcaat 


cctgcttgga 


gaaaagctgt 


ttttatctca 


ccttataatt 


17700 


cacagaacgc 


tgtagcttca 


aaaatcttag 


gattgcctac 


gcagactgtt 


gattcatcac 


17760 
agggttctga


atatgactat 


gtcatattca 


cacaaactac 


tgaaacagca 


cactcttgta 


17820


atgtcaaccg


cttcaatgtg 


gctatcacaa 


gggcaaaaat 


tggcattttg 


tgcataatgt 


17880 


ctgatagaga 


tctttatgac 


aaactgcaat 


ttacaagtct 


agaaatacca 


cgtcgcaatg 


17940 


tggctacatt 


acaagcagaa 


aatgtaactg 


gactttttaa 


ggactgtagt 


aagaf catta 


18000 


ctggtcttca 


tcctacacag 


gcacctacac 


acctcagcgt 


tgatataaag 


ttcaagactg 


18060 


aaggattatg 


tgttgacata 


ccaggcatac 


caaaggacat 


gacctaccgt 


agactcatct 


18120 


ctatgatggg 


tttcaaaatg 


aattaccaag 


tcaatggtta 


ccctaatatg


tttatcaccc 


18180 


gcgaagaagc 


tattcgtcac 


gttcgtgcgt 


ggattggctt


tgatgtagag 


ggctgtcatg 


18240 


caactagaga 


tgctgtgggt 


actaacctac 


ctctccagct 


aggattttct 


acaggtgtta 


18300 


acttagtagc 


tgtaccgact 


ggttatgttg


acactgaaaa 


taacacagaa 


ttcaccagag 


18360 


ttaatgcaaa 


acctccacca 


ggtgaccagt 


ttaaacatct 


tataccactc 


atgtataaag 


18420 


gcttgccctg 


gaatgtagtg 


cgtattaaga 


tagtacaaat 


gctcagtgat 


acactgaaag 


18480 


gattgtcaga 


cagagtcgtg 


ttcgtccttt 


gggcgcatgg 


ctttgagctt 


acatcaatga 


18540 


agtactttgt 


caagattgga 


cctgaaagaa 


cgtgttgtct 


gtgtgacaaa 


cgtgcaactt


18600 


gcttttctac 


ttcatcagat 


acttatgcct 


gctggaatca 


ttctgtgggt 


tttgactatg 


18660 


tctataaccc


atttatgatt 


gatgttcagc 


agtggggctt 


tacgggtaac 


cttcagagta 


18720


accatgacca 


acattgccag 


gtacatggaa 


atgcacatgt 


ggctagttgt 


gatgctairca 


18780 


tgactagatg 


tttagcagtc 


catgagtgct 


ttgttaagcg 


cgttgattgg 


tctgttgaat 


18840 


accctattat 


s^'gagatgaa 


ctgagggtta 


at.t.<itgct.t.g 


cagaaaagta


caacacatgg 


18900


ttgtgaagtc 


tgcattgctt 


gctgataagt 


ttccagttct 


tcatgacatt 


ggaaatccaa 


18960 


aggctatcaa 


gtgtgtgcct 


caggctgaag 


tagaatggaa 


gttctacgat 


gctcagccat 


19020 


gtagtgacaa 


agcttacaaa 


atagaggaac 


tcttctattc 


ttatgctaca 


catcacgata 


19080 


aattcactga 


tggfcgtttgt 


ttgttttgga 


attgtaacgt 


tgatcgttac 


ccagccaatg 


19140 


caattgtgtg 


taggtttgac 


acaagagtct 


tgtcaaactt


gaacttacca 


ggctgtgatg 


19200 


gtggtagttt 


gtatgtgaat 


aagcatgcat 


tccacactcc 


agctttcgat 


aaaagtgcat 


19260 


ttactaattt 


aaagcaattg 


cctttctttt 


actattctga 


tagtccttgt 


gagtctcatg 


19320 


gcaaacaagt 


agtgtcggat 


attgattatg 


ttccactcaa 


atctgctacg 


tgtattacac 


19380 


gatgcaattt 


aggtggtgct 


gtttgcagac 


accatgcaaa 


tgagtaccga 


cagtacttgg 


19440 


atgcatataa 


tatgatgatt 


tctgctggat 


ttagcctatg 


gatttacaaa 


caatttgata 


19500 


cttataacct


gtggaataca 


tttaccaggt 


tacagagttt 


agaaaatgtg 


gcttataatg 


19560 


ttgttaataa 


aggacacttt 


gatggacacg 


ccggcgaagc 


acctgtttcc 


atcattaata 


19620 
atttaatgat


aggcttagcc 


aagcgctcac
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oatcttttac 
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U T D U 


aagatttgtc 


agtgatttca 


aaagtggtca 


acrcrttacaat 
1 


tcrant at rrr't 


era aatt+"oat 


20520


tcatgctttg 


gtgtaaggat 


crcf a c a i" crt" f cr 


aaa cir't t ot a 
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ci d y C a a g X. C 


20580


gagcgtggca 


accaggtgtt 


crccrat erect" a 


act tcrtar'aa 


ora t rrr*a a a rra 


a t nf r''!" 4- o+- 4" (*t 
ciL.y^Li L.^L. uy 


20640


aaaagtgtga 


ccttcagaat 


tatggtgaaa


at{3'Gt crt t at 


a CO a a a a rr^ra 

Vm* d CL CL d \A \^ d 






atgtcgcaaa 


gtatactcaa 


ctgtgtcaat 


acttaaatac


acttacttta


crct crt a ccr't 


20760


acaacatgag 


agttattcac 


tttaatacta 


gctctgataa 


aacraatt nca 




<£• V/ W fci V/ 


ctgtgctcag 


acaatggttg 


ccaactggca 


cactacttgt 


ccrattcacrat 


ctt aat rra ct 
Virf^v«ddL>ydWH> 




tcgtctccga


cgcatattct 


actttaattg 


gagactgtgc 


aacaCTt anat 


a Caere t aat a 


20940 


aatgggacct 


tattattagc 


gatatgtatg 


accctaggac 


caaacatgtg 


acaaaacrarra 


21000 


atgactctaa 


agaagggttt 


ttcacttatc


tcftotaaatt 


t ataaagc&a 


aaactaaccc 


21060 


tgggtggttc 


tatagctgta 


aagataacag 


agcattcttg


gaatgctgac 


cttt acaacrc 


21120 


ttatgggcca 


tttctcatgg 


tggacagctt 


ttgttacaaa 


tgtaaatgca 


teat cat ccfcr 


21180 


aagcattttt 


aattggggct 


aactatcttg 


gcaagccgaa 


ggaacaaatt 


gatggctata 


21240 


ccatgcatgc 


taactacatt 


ttctggagga 


acacaaatcc 


tatccagttg 


tcttcctatt 


21300 


cactctttga 


catgagcaaa 


tttcctctta 


aattaagagg 


aactgctgta 


atgtctctta 


21360 


aggagaatca 


aatcaatgat 


atgatttatt 


ctcttctgga 


aaaaggtagg 


cttatcatta 


21420 
gagaaaacaa


cagagttgtg 


gtttcaagtg


atattcttgt 


taacaactaa 


acgaacatgt 


21480 


ttattttctt 


attatttctt 


actctcacta 


gtggtagtga 


Gcttgaccgg 


tgcaccactt 


21540 


ttgatgatgt 


tcaagctcct 


aattacactc 


aacatacttc 


atctatgaigg 


gqggtttact 


21600


atcctgatga 


aatttttaga 


tcagacactc 


tttatttaac 


tcaggattta 


tttcttccat


21660 


tttattctaa 


tgttacaggg 


tttcatacta 


ttaatcatac 


gtttggcaac 


cctgtcatac 


21720 


cttttaagga 


tggtatttat 


tttgctgcca 


cagagaaatc 


aaatgttgtc


cgtggttggg 


21780 


tttttggttc 


taccatgaac 


aacaagtcac 


agtcggtgat 


tattattaac 


aattctacta 


21840 


atgttgttat 


acgagcatgt 


aactttgaat 


tgtgtgacaa 


ccctttcttt 


gctgtttcta


21900 


aacccatggg 


tacacagaca 


catactatga 


tattcgataa 


tgcatttaat 


tgcactttcg 


21960 


agtacatatc 


tgatgccttt 


tcgcttgatg 


tttcagaaaa


gtcaggtaat 


tttaaacact 


22020 


tacgagagtt 


tgtgtttaaa 


aataaagatg 


ggtttctcta 


tgtttataag 


ggctatcaac 


22080 


ctatagatgt 


agttcgtgat 


ctaccttctg 


gttttaacac 


tttgaaacct 


atttttaagt 


22140 


tgcctcttgg 


tattaacatt 


acaaatttta 


gagccattct 


tacagccttt 


tcacctgct'C 


22200 


aagacatttg 


gggcacgtca 


gctgcagcct 


attttgttgg 


ctatttaaag 


ccaactacat 


22260 


ttatgctcaa 


gtatgatgaa 


aatggtacaa 


tcacagatgc 


tgttgattgt 


tctcaaaatc 


22320


cacttgctga 


actcaaatgc 


tctgttaaga 


gctttgagat 


tgacaaagga 


atttaccaga 


22380 


cctctaattt 


cagggttgtt 


ccctcaggag 


atgttgtgag 


attccctaat 


attacaaact 


22440 


tgtgtccttt 


tggagaggtt 


tttaatgcta 


ctaaattccc 


ttctgtctat 


gcatgggaga 


22500 


gaaaaaaaat 


ttctaattgt 


qttgctgatt 


actctgtgct 


ctacaactca 


acattttttt


22560 


caacctttaa 


gtgctatggc 


gtttctgcca 


ctaagttgaa 


tgatctttgc 


ttctccaatg 


22620 


tctatgcaga 


ttcttttgta 


gtcaagggag 


atgatgtaag 


acaaatagcg 


ccaggacaaa 


22680 


ctggtgttat 


tgctgattat


aattataaat 


tgccagatga 


tttcatgggt 


tgtgtccttg 


22740 


cttggaatac 


taggaacatt 


gatgctactt 


caactggtaa 


ttataattat 


aaatataggt 


22800 


atcttagaca 


tggcaagctt 


aggccctttg 


agagagacat 


atctaatgtg 


cctttctccc 


22860 


ctgatggcaa 


accttgcacc 


ccacctgctc 


ttaattgtta 


ttggccatta 


aatgattatg 


22920


gtttttacac 


cactactggc 


attggctacc 


aaccttacag 


agttgtagta 


ctttcttttg 


22980 


aacttttaaa 


tgcaccggcc 


acggtttgtg 


gaccaaaatt 


atccactgac 


cttattaaga 


23040 


accagtgtgt 


caattttaat 


tttaatggac 


tcactggtac 


tggtgtgtta 


actccttctt 


23100 


caaagagatt 


tcaaccattt 


caacaatttg 


gccgtgatgt 


ttctgatttc 


actgattccg 


23160 


ttcgagatcc


taaaacatct


gaaatattag 


acat.t.iLcacc 


ttgcgctttt


gggggtgtaa 


23220 


gtgtaattac 


acctggaaca 


aatgcttcat 


ctgaagttgc 


tgttctatat 


caagatgtta 


23280 
gggaaaatat


gagcaatata 


ttaaatggcc 


ttggtatgtt 


tggctcggct 


25080 
tcattgctgg


actaattgcc 


atcgtcatgg ttacaatctt 


gctttgttgc 


atgactagtt 


25140 


gttgcagttg 


cctcaagggt 


gcatgctctt


gtggttcttg 


ctgcaagttt 


gatgaggatg 


25200


actctgagcc 


agttctcaag 


ggtgtcaaat 


tacattacac 


ataaacgaac 


ttatggattt 


25260 


gtttatgaga 


ttttttactc 


ttggatcaat tactgcacag 


ccagtaaaaa 


ttgacaatgc


25320 


ttctcctgca 


agtactgttc 


atgctacagc 


aacgataccg 


ctacaagcct


cactcccttt 


25380 


cgtfatggctt 


gttattggcg 


ttgcatttct 


tgctgttttt 


cagagcgcta


ccaaaataat 


25440 


tgcgctcaat 


aaaagatggc 


agctagccct 


ttataagggc 


ttccagttca 


tttgcaattt 


25500 


actgctgcta 


tttgttacca 


tctattcaca 


tcttttgctt 


gtcgctgcag 


gtatggaggc 


25560


gcaatttttg 


tacctctatg 


ccttgatata 


ttttctacaa 


tgcatcaacg 


catgtagaat 


25620 


tattatgaga 


tgttggcttt 


gttggaagtg 


caaatccaag 


aacccattac 


tttatgatgc 


25680 


caactacttt


gtttgctggc 


acacacataa 


ctatgactac 


tgtataccat 


ataacagtgt 


25740 


cacagataca 


attgtcgtta 


ctgaaggtga 


cggcatttca 


acaccaaaac 


tcaaagaaga 


25800 


ctaccaaatt 


ggtggttatt 


ctgaggatag 


gcactcaggt 


gttaaagact 


atgtcgttgt 


25860 


acatggctat 


ttcaccgaag 


tttactacca 


gcttgagtct 


acacaaatta 


ctacagacac 


25920 


tggtattgaa


aatgctacat 


tcttcatctt 


taacaagctt 


gttaaagacc 


caccgaatgt 


25980


gcaaatacac 


acaatcgacg 


gctcttcagg 


agttgctaat 


ccagcaatgg 


atccaattta 


26040 


tgatgagccg 


acgacgacta 


ctagcgtgcc 


tttgtaagca 


caagaaagtg 


agtacgaact 


26100 


tatgtactca 


ttcgtttcgg 


aagaaacagg 


tacgttaata 


gttaatagcg 


tacttctttt 


26160


tcttgctttc 


gtggtattct 


tgctagtcac 


actagccatc 


cttactgcgc 


ttcgattgtg 


26220


tgcgtactgc 


tgcaatattg


ttaacgtgag tttagtaaaa 


ccaacggttt 


acgtctactc 


26280 


gcgtgttaela 


aatctgaact 


cttctgaagg agttcctgat 


cttctggtct 


aaacgaacta 


26340 


actattatta 


ttattGtgtt, 


tggaacttta 


acattgctta 


tcatggcaga 


caacggtact 


26400


attaccgttg 


aggagcttaa 


acaactcctg 


gaacaatgga 


acctagtaat 


aggtttccta 


26460 


ttcctagcct 


ggattatgtt 


actacaattt 


gcctattcta 


atcggaacag 


gtttttgtac 


26520 


ataataaagc 


ttgttttcct 


ctggctcttg tggccagtaa 


cacttgcttg 


ttttgtgctt 


26580


gctgctgtct 


acagaattaa 


ttgggtgact 


ggcgggattg 


cgattgcaat 


ggcttgtatt 


26640 


gtaggcttga 


tgtggcttag 


ctacttcgtt 


gcttccttca 


ggctgtttgc 


tcgtacccgc 


26700


tcaatgtggt 


cattcaaccc 


agaaacaaac 


attcttctca 


atgtgcctct 


ccgggggaca 


26760


attgtgacca 


gaccgctcat 


ggaaagtgaa 


cttgtcattg 


gtgctgtgat 


cattcgtggt 


26820 


cacttgcgaa 


tggccggaca 


ctccctaggg cgctgtgaca 


ttaaggacct 


gccaaaagag 


26880 


atcactgtgg 


ctacatcacg 


aacgctttct 


tattacaaat 


taggagcgtc 


gcagcgtgta 


26940 
ggcactgatt


caggttttgo 


tgcatacaac 


cgctaccgta 


ttggaaacta 


taaattaaat


27000 


acagaccacg 


Gcggtagcaa 


cgacaatatt 


gctttgctag


tacagtaagt 


gacaacagat 


27060 


gtttcatctt 


gttgacttcc


aggttacaat 


agcagagata 


ttgattatca 


ttatgaggac


27120 


tttcaggatt 


gctatttgga 


atcttgacgt 


tataataagt 


tcaatagtga 


gacaattatt 


27180 


taagcctcta 


actaagaaga 


attattcgga 


gttagatgat 


cxaaCTaaccta 


t Q a a cf 1" +" Cf a 


27240


ttatccataa 


aacgaacatg 


aaaattattc 


tcttcctgac 


attgattgta 


tttaca-f-rtt 

^ ^ d V*>» ct In. Vp* w L» 




gcgagctata 


tcactatcag 


Qaatatatta 


aacfcrtaccrac 

04 \^ 1^ 


t crtz a r*"!" a r* t a 


a a a rra t" t 


27360


gcccatcagg 


aacatacgag 


ggcaattcac 


catttcaccc 


tcttactaac 


aataasthtcf 

t* t* t- &l d CA (_ \^ 


27420 


cactaacttg 


cactagcaca 


cactttgctt 


ttacttatac


t craccrrr'h ^ ct* 


w y ci cx U- ct w L> 


27480


atcagctgcg 


tgcaagatca 


gtttcaccaa 


aacttttcat 


caoacaacracr 


Cfacfcrt'l" r'a ac 


27540


aagagctcta 


ctcgccactt 


tttctcattg 


ttgctgctct 


acrtattttta 


ai" a p"H i- rrr*"!- 


27600


tcaccattaa 


gagaaagaoa 


gaatgaatga 


gctcacttta 


a 1 1 cf a +" c t 


_ X. -L a_ ^i, _ _ A. J. 




tttagccttt 


ctgctatt-CG 


ttgttttaat 


aatgct t, &"tt 


at att tt ocs t 


t't" *r oa r»na 


27720


aatccaggat 


ctagaagaac 


cttgtaccaa 


agtctaaacg


aaci?5l"cr?5?^ac 


4- 4- /->■ 4- ^ a 4- +- rfh 


27780


tttgacttgt 


atttctct^t 


gcagttgcat 


atgcactgtia 


cr "t a c a CI r* rr c t 


y U.y<M>d L.(>_>L>C1CI 




taaacctcat 


gtgcttga^g 


atccttgtaa 


crcrfcacaapac 

^4 w d W4 


tacrrf'rrrfl" at 


dl^lk* L>ClL.Ciy^CL 




ctgcttggct 


ttgtgctcta 


ggaaaggttt 


taGcttt i" ca 


t a GT a i" rrrr f a o 


a f^t* a 4" rrrri" r» 




aaacatgcac 


acctaatgtt 


actatcaact 


gtcaagatcc 


aactaatacrt: 


ccCTG 1 1" a t a cr 


28020 



ctaggtgttg 


gtaccttcat 


gaaggtcacc 


aaactgctgc 


atttagagac


gtacttgttg


28080 



ttttaaataa 


acgaacaaat


taaaatgtct 


gataatggac 


CGcaatcaaa 


ccaacgtagt 


28140 


gccccccgca 


ttacatttgg 


tggacccaca 


gattcaactg 


acaataacca 


aaataaaaaa 


28200 



cgcaatgggg 


caaggccaaa


acagcgccga 


ccccaaggtt


tacccaataa 


tactacatct


28260 


tggttcacag 


ctctcactca 


gcatggcaag 


gaggaactta 


gattcGctcg 


aofcrccaaaQC 

«-*>^>j w>^dyyv^ 


28320 


gttccaatca


acaccaatag 


"tggtccagat 


gaccaaattg 


get actaccg 


aagagctacc


28380 


cgacgagttc 


gtggtggtga 


cggcaaaatg 


aaagagctca 


gccccagatg 


gtacttctat 


28440 


tacctaggaa 


ctggcccaga 


agcttcactt 


ccctacggcg 


ctaacaaaga 


aggcatcgta 


28500 


tgggc-T-gcaa 


ctgagggagc 


cttgaataca


cccaaagacc 


acattggcac 


ccgGaatcct • 


28560 


aataacaatg 


ctgccaccgt 


gctacaactt 


cctcaaggaa 


caacattgcc 


aaaaggcttc 


28620 


tacgcagagg 


gaagcagagg 


cggcagtcaa 


gcctcttctc 


gctcctcatc 


acgtagtcgc 


28680 


ggtaattcaa 


gaaattcaac 


tcctggcagc 


agtaggggaa 


attctcctgc 


tcgaatggct 


28740 
agcggaggtg


gtgaaactgc 


cctcgcgcta


ttgctgctag


acagattgaa 


ccagcttgag 


28800 


agcaaagttt 


ctggtaaagg 


ccaacaacaa 


caaggccaaa 


ctgtcactaa 


gaaatctgct 


28860 


gctgaggcat 


ctaaaaagcc


tcgccaaaaa 


cgtactgcca 


caaaacagta caacgtcact 


28920


caagcatttg 


ggagacgtgg 


tccagaacaa 


acccaaggaa 


atttcgggga 


ccaagaccta


28980 


atcagacaag 


gaactgatta 


caaacattgg 


ccgcaaattg 


cacaatttgc tccaagtgcc 


29040 


tctgcattct 


ttggaatgtc 


acgcattggc 


atggaagtca 


caccttcggg 


aacatggctg


29100 


acttatcatg 


gagccattaa 


attggatgac 


aaagatccac 


aattcaaaga 


caacgtcata 


29160 


ctgctgaaca 


agcacattga 


cgcatacaaa 


acattcGcac 


caacagagcc taaaaaggac 


29220


aaaaagaaaa 


agactgatga 


agctcagcct 


ttgccgcaga 


gacaaaagaa 


gcagcccact 


29280 


gtgactcttc 


ttcctgcggc 


tgacatggat 


gatttctcca 


gacaacttca 


aaattccatg 


29340 


agtggagctt 


ctgctgattc 


aactcaggca 


taaacactca 


tgatgaccac 


acaaggcaga 


29400


tgggctatgt 


aaacgttttc 


gcaattccgt 


ttacgataca 


tagtctactc 


ttgtgcagaa 


29460 


tgaattctcg 


taactaaaca 


gcacaagtag 


gtttagttaa 


ctttaatctc


acatagcaat 


29520 


ctttaatcaa 


tgtgtaacat 


tagggaggac 


ttgaaagagc 


caccacattt 


tcatcgaggc 


29580 


cacgcggagt 


acgatcgagg 


gtacagtgaa 


taatgctagg 


gagagctgcc 


tatatggaag 


29640


agccctaatg 


tgtaaaatta 


attttagtag 


tgcf atcccc 


atgtgatttt 


aatagcttct 


29700 


taggagaatg 


acaaaaaaaa 


aaaaaaaaaa 


aaaaaa 






29736 



<210> 2 

<211> 29736 

<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 2 



ctacccagga 


aaagccaacc 


aacctcgatc 


tcttgtagat 


ctgttctcta 


aacgaacttt 


60 


aaaatctgtg 


tagctgtcgc 


tcggctgcat 


gcctagtgca 


cctacgcagt 


ataaacaata 


120


ataaatttta 


ctgtcgttga 


caagaaacga 


gtaactcgtG 


cctcttctgc 


agactgctta 


180 


cggtttcgtc 


cgtgttgcag tcgatcatca 


gcatacctag 


gtttcgtccg 


ggtgtgaccg 


240 


aaaggtaaga 


tggagagcct 


tgttcttggt 


gtcaacgaga 


aaacacacgt 


ccaactcagt 


300 


ttgcctgtcc 


ttcaggttag agacgtgcta gtgcgtggct


tcggggactc 


tgtggaagag 


360 


gccctatcgg 


aggcacgtga 


acacctcaaa 


aatggcactt 


gtggtctagt 


agagctggaa 


420 


aaaggcgtac 


tgccccagct 


tgaacagccc 


tatgtgttca 


ttaaacgttc tgatgcctta 


480 


agcaccaatc 


acggccacaa 


ggtcgttgag 


ctggttgcag 


aaatggacgg 


cattcagtac 


540 


ggtcgtagcg 


gtataacact 


gggagtactc


gtgccacatg 


tgggcgaaac 


cccaattgca


600
taccgcaatg


ttcttcttcg 


taagaacggt


aataagggag 


ccggtggtca 


tagctatggc 


660 


atcgatctaa


agtcttatga 


cttaggtgac 


gagcttggca 


ctgatcccat 


tgaagattat 


720 


gaacaaaact 


ggaacactaa 


gcatggcagt 


ggtgcactcc 


gtgaactcac 


tcgtgagctc


780 


aatggaggtg 


c^gtcactcg 


ctatgtcgac


aacaatttct 


gtggcGcaga 


tgggtaccct


840 


cttgattgca 


tcaaagattt 


tctcgcacgc 


gcgggcaagt 


caatgtgcac 


tctttccgaa 


900 


caacttgatt 


acatcgagtc 


gaagagaggt 


gtctactgct 


gccgtgacca


tgagcatgaa 


960 


attgcctggt 


tcactgagcg 


ctctgataag 


agctacgagc 


accagacacG 


cttcgaiaatt 


1020 


aagagtgcca 


agaaatttga 


cactttcaaa 


ggggaatgcc 


caaagt-ttgiu 


gtttcbtctt. 


1080 


aactcaaaag 


tcaaagtcat 


tcaaccacgt 


gttgaaaaga 


aaaagactga 


gggtttcatg 


1140 


gggcgtatac 


gctctgtgta 


ccctgttgca 


tctccacagg 


agtgtaacaa 


tatgcacttg 


1200 


tctaccttga 


tgaaatgtaa 


tcattgcgat 


gaagtttcat 


ggcagacgtg 


cgactttctg 


1260 


aaagccactt 


gtgaacattg 


tggcactgaa 


aatttagtta 


ttgaaggacc 


tactacatgt 


1320 


gggtacctac 


ctactaatgc 


tgtagtgaaa 


atgccatgtc 


ctgcctgtca 


agacccagag 


1380 


attggacctg 


agcatagtgt 


tgcagattat 


cacaaccact 


caaacattga 


aactcgactc 


1440 


cgcaagggag 


gtaggactag 


atgttttgga 


ggotgtgtgt 


ttgcctatgt 


tggctgctat


1500 


aataagcgtg 


cctactgggt 


tcctcgtgct 


agtgctgata 


ttggctcagg 


ccatactggc 


1560 


attactggtg 


acaatgtgga 


gaccttgaat 


gaggatctcc 


ttgagatact 


gagtcgtgaa 


1620 


cgtgttaaca 


ttaacattgt 


tggcgatttt


catttgaatg 


aagaggttgc 


catcattttg


1680 


gcatctttct 


ctgcttctac 


aagtgccttt 


attgacacta 


ta^agagtct 


tgattacaag 


1740


tctttcaaaa 


ccattgttga 


gtccfcgcggt 


aactataaag 


ttaccaaggg 


aaagcccgta


1800 


aaaggtgctt 


ggaacattgg 


acaacagaga 


tcagttttaa 


caccactgtg 


tggttttccc 


1860 


tcacaggctg 


ctggtgttat 


cagatcaatt 


tttgcgcgca 


cacttgatgc 


agcaaaccac 


1920 


tcaattcctg 


atttgcaaag 


agcagctgtG 


accatacttg 


atggtatttc 


tgaacagtca 


1980 


ttacgtcttg 


tcgacgccat 


ggtttatact 


tcagacctgc 


tcaccaacag 


tgtcattatt 


2040 


atggcatatg 


taactggtgg 


tcttgtacaa 


cagacttctc 


agtggttgtc 


taatcttttg


2100 


ggcactactg 


ttgaaaaact


caggcctatc 


tttgaatgga 


ttgaggcgaa 


acttagtgca 


2160 


ggagttgaat 


ttctcaagga 


tgcttgggag 


attctcaaat 


ttctcattac 


aggtgttttt


2220


gacatcgtca 


agggtcaaat 


acaggttgct 


tcagataaca 


tcaaggattg 


tgtaaaatgc 


2280 


ttcattgatg 


ttgttaacaa 


ggcactcgaa 


atgtgcattg 


atcaagtcac 


tatcgctggc 


2340 


gcaaagttgc 


gatcactcaa 


cttaggtgaa


gtcttcatcg 


ctcaaagcaa 


gggactttac 


2400 


cgtcagtgta 


tacgtggcaa 


ggagcagctg 


caactactca 


tgcctcttaa 


ggcaccaaaa


2460 
gaagtaacct


ttcttgaagg 


tgattcacat 


gacacagtac 


ttacctctga 


craacfattatt: 


2520 


ctcaagaacg 


gtgaactcga 


agcactcgag 


acgcccgttg


atagcttcac 


aaatggagct 


2580 


atcgttggca 


caccagtctg


tgtaaatggc 


ctcatgctct 


tagagattaa 


ggacaaagaa 


2640 


caatactgcg 


cattgtctcc 


tggtttactg 


gctacaaaca 


atgtctttcg 


cttaaaaggg 


2700 


ggtgcaccaa 


ttaaaggtgt 


aacctttgga 


g^aagatactg 


tttQCfcraacft 


t caacra'fc'ha.c 


2760 


aagaatgtga 


gaatcacatt 


tgagcttgat 


gaacgtgttg 


acaaagtgct 


taatgaaaag 


2820 


tgctctgtct 


acactgttga 


atccggtacc 


gaagttactg 


agtttgcatg 


tcfttataaca 


2880


gaggctgttg 


tgaagacttt 


acaaccagtt 


tctgatctcc 


ttaccaacat 


gggt atfcgat 


2940 


cttgatgagt 


ggagtgtagc 


tacattctac


ttatttgatg 


atgctggtga 


agaaaacttt 


3000 


tcatcacgta 


tgtattgttc 


cttttaccct 


ccagatgagg 


aacraacraacfa 


ccratCTcacraCf 


3060


tgtgaggaag


aagaaattga 


tgaaacctgt 


craacataacft 


acggtacaga 


acrataatitat 


3120 


caaggtctcc 


ctctggaatt 


tggtgcctca


gctgaaacag 


ttcgagttga 


CTQaaoraaCTaa 


3180 


gaggaagact 


ggctggatga 


tactactgag 


caatcagaga 


ttgagccaga 


accagaacct 


3240 


acacctgaag 


aaccagttaa 


tcagtttact 


ggttatttaa 


aacttactga


caatrrhl" nr'C 

(J4 L« U V*/ 


3300 


attaaatgtg 


ttgacatcgt 


taaggaggca 


caaaglbgcta 



atcctatcrcrt 


aat t crt r3 a ?i t 


3360 


gctgctaaca 


tacacctgaa 


acataataat 


CTcr t cr t a cr c a 0 


crtcrcact" naa 


wMb^u U CI CI v./ 


3420 


aatggtgcca 


tgcaaaagga 


gagtgatgat 


tacattaagc 


taaatggccc


tcttacarri" a 


3480 


ggagggtctt 


gtttgctttc 


tggacataat 


cttgctaaga 


agtgtctgca 


tattattaaa 


3540 


cctaacctaa 


atgcaggtga 


ggacatccag 


cttcttaagg 


cagcatatga 


aaatttcaat 


3600 


tcacaggaca 


tcttacttgc 


accattgttg 


tcagcaggca 


tatttggtgc 


taaaccactt 


3660 


cagtctttac 


aagtgtgcgt 


gcagacggtt 


cgtacacagg 


tttatattgc 


agtcaatgac 


3720 


aaagctcttt 


atgagcaggt 


tgtcatggat


tatcttgata 


acctgaagcc 


taaaQtaaaa 


3780


gcacctaaac 


aagaggagcc 


accaaacaca 


gaagattcca 


aaactgaacra 


gaaatctgtc 


3840 


gtacagaagc 


ctgtcgatgt 


gaagccaaaa 


attaaggcct 


gcattgatga 


ggttaccaca 


3900 


acactggaag 


aaactaagtt 


tcttaccaat 


aagttactct 


tgtttgctga 


tatcaatggt 


3960 


aagctttacc 


atgattctca 


gaacatgctt 


agaggtgaag 


atatgtcttt 


ccttgagaag 


4020 


/*t^ ^ a 

gaT.gCoCC L. u 


acanggtiagg 


t-gacgttiauc 


actagtggtg 


atatcacttg 


tgttgtaata 


4080 


ccctccaaaa 


aggctggtgg 


cactactgag 


atgctctcaa 


gagctttgaa 


gaaagtgcca 


4140 


gttgatgagt 


atataaccac 


gtaccctgga 


caaggatgtg 


ctggttatac 


acttgaggaa 


4200 


gctaagactg 


ctcttaagaa 


atgcaaatct 


gcattttatg 


tactaccttc 


agaagcacct 


4260 
aatgctaagg


aagagattct


aggaactgta


tcctggaatt 


tgagagaaat 


gcttgctcat 


4320 


gctgaagaga 


caagaaaatt 


aatgcctata 


tgcatggatg 


ttagagccat 


aatggcaacc


4380 


atccaacgta 


agtataaagg 


aattaaaatt 


caagagggca 


tcgttgacta tggtgtccga


4440 


ttcttctttt 


atactagtaa


agagcctgta 


gcttctatta 


ttacgaagct 


gaactctcta


4500 


aatgagccgc 


ttgtcacaat 


gccaattggt 


tatgtgacac 


atggttttaa tcttgaagag


4560 


gctgcgcgct 


gtatgcgttc 


tcttaaagct 


cctgccgtag 


tgtcagtatc atcaccagat 


4620 


gctgttacta 


catataatgg 


atacctcact 


tcgtcatcaa 


agacatctga 


ggagcacttt 


4680 


gtagaaacag 


tttctttggc 


tggctcttac 


agagattggt 


cctattcagg 


acagcgtaca 


4740 


gagttaggtg 


ttgaatttct 


taagcgtggt 


gacaaaattg 


tgtaccacac 


tctggagagc 


4800 


cccgtcgagt 


ttcatcttga 


cggtgaggtt 


ctttcacttg 


acaaactaaa 


gagtctctta 


4860


tccctgcggg 


aggttaagac 


tataaaagtg 


ttcacaactg 


tggacaacac 


taatctccac 


4920 


acacagcttg 


tggatatgtc 


tatgacatat 


ggacagcagt 


ttggtccaac 


atacttggat


4980 


ggtgctgatg 


ttacaaaaat 


taaacctcat 


gtaaatcatg 


agggtaagac 


tttctttgta 


5040 


ctacctagtg 


atgacacact 


acgtagtgaa 


gctttcgagt 


actaccatac tcttgatgag


5100 


agttttcttg 


gtaggtacat 


gtctgcttta 


aaccacacaa 


agaaatggaa 


atttcctcaa


5160 


gttggtggtt 


taacttcaat


taaatgggct 


gataacaatt 


gttatttgtc 


tagtgtttta 


5220 


ttagcacttc 


aacagcttga 


agtcaaattc 


aatgcaccag 


cacttcaaga 


ggcttattat 


5280 


agagcccgtg 


ctggtgatgc 


tgctaacttt 


tgtgcactca 


tactcgctta 


cagtaataaa 


5340 


actgttggcg 


agcttggtga 


tgtcagagaa 


actatgacGc 


atcttctaca 


gcatgctaat 


5400 


ttggaatctg 


caaagcgagt 


tcttaatgtg 


gtgtgtaaac 


attgtggtca 


gaaaactact 


5460 


accttaacgg 


gtgtagaagc 


tgtgatgtat 


atgggtactc


tatcttatga 


taatcttaag 


5520 


acaggtgttt 


ccattccatg 


tgtgtgtggt 


cgtgatgcta 


cacaatatct 


agtacaacaa 


5580 


gagtcttctt 


ttgttatgat 


gtctgcacca 


cctgctgagt 


ataaattaca 


gcaaggtaca 


5640 


ttcttatgtg 


cgaatgagta 


cactggtaac 


tatcagtgtg 


gtcattacac 


tcatataact 


5700 


gctaaggaga 


Gcctctatcg 


tattgacgga 


gctcacctta 


caaagatgtc 


agagtacaaa


5760


ggaccagtga 


ctgatgtttt 


ctacaaggaa 


acatcttaca 


ctacaaccat 


caagcctgtg 


5820 


tcgtataaac 


tcgatggagt 


tacttacaca 


gagattgaac 


caaaattgga 


tgggtattat


5880 


aaaaaggata 


atgcttacta 


tacagagcag 


cctatagacc 


ttgtaccaac 


tcaaccatta 


5940 


ccaaatgcga 


gttttgataa


tttcaaactc 


acatgttcta 


acacaaaatt 


tgctgatgat 


6000 


ttaaatcaaa 


tgacaggctt 


cacaaagcca 


gcttcacgag 


agctatctgt 


cacattcttc


6060 


ccagacttga 


atggcgatgt 


agtggctatt 


gactatagac 


actattcagc 


gagtttcaag


6120 
aaaggtgcta


aattactgca 


taagccaatt 


gtttggcaca 


ttaaccaggc 


tacaaccaag 


6180 


acaacgttca 


aaccaaacac 


ttggtgttta


cgttgtcttt 


ggagtacaea 


gccagtagat 


6240 


acttcaaatt 


catttgaagt


tctggcagta 


gaagacacac 


aaggaatgga 


caatcttgct 


6300 


tgtgaaagtc 


aacaacccac 


©tctgaagaa 


gtagtggaaa 


atcctaccat 


acagaaggaa 


6360 


gtcatagagt


gtgacgtgaa 


aactaccgaa 


gttgtaggca 


atgtcatact 


taaaccatca 


6420 


gatgaaggtg 


ttaaagtaac 


acaagagtta 


ggtcatgagg 


atcttatggc 


tgcttatgtg 


6480 


gaaaacacaa 


gcattaccat 


taagaaacct 


aatgagcttt 


cactagccfct 


aggtttaaaa 


6540 


acaattgcca 


ctcatggtat 


tgctgcaatt 


aatagtgttc 


cttggagtaa 


aattttggct 


6600 


tatgtcaaac 


cattcttagg 


acaagcagca 


attacaacat 


caaattgcgc 


taagagatta 


6660 


gcacaacgtg 


tgtttaacaa 


ttatatgcct


tatgtgttta 


cattattgtt 


ccaattgtgt 


6720 


acttttacta 


aaagtaccaa 


ttctagaatt 


agagcttcac 


tacctacaac 


tattgctaaa 


6780 


aatagtgtta 


agagtgttgc 


taaattatgt


ttggatgccg 


gcattaatta 


tcytcaacrt ca 


6840 


cccaaatttt 


ctaaattgtt 


cacaatcgct 


atgtggctat 


tgttgttaag


tatttgctta 


6900 


ggttctctaa 


tctgtgtaac 


tgctgctttt 


ggtgtactct 


tatctaattt 


t QQt CTGt nrtit 


6960


tcttattgta 


atggcgtta^ 


agaattgtat 


cttaattcgt 


ctaacgttac 


iiactatacrat: 


7020 


ttctgtgaag 


gttcttttcc 


ttgcagcatt 


tgtttaagtg 


gattagactc 


ccttaattct 


7080 


tatccagctc 


ttgaaaccat 


tcaggtgacg 


atttcatcgt 


acaagctaga 


cttgacaatt 


7140 


ttaggtctgg 


ccgctgagtg 


ggttttggca 


tatatgttgt 


tcacaaaatt 


cttttattta 


7200 


ttaggtcttt 


cagctataat 


gcaggtgttc 


tttggctatt 


ttgctagtca 


tttcatcagc 


7260 


aattcttggc 


tcatgtggtt 


tatcattagt 


attgtacaaa 


tggcacccgt 


ttctgcaatg 


7320 


gttaggatgt 


acatcttctt 


tgcttctttc 


tactacatat 


ggaagagcta 


tgttcatatc 


7380 


atggatggtt 


gcacctcttc 


gacttgcatg 


atgtgctata 


agcgcaatcg 


tgccacacgc 


7440 


gttgagtgta 


caactattgt 


taatggcatg 


aagagatctt 


tctatgtcta 


tgcaaatacra 


7500 


ggccgtggct 


tctgcaagac 


tcacaattgg 


aattgtctca 


attgtgacac 


attttgcact 


7560 


ggtagtacat 


tcattagtga 


tgaagttgct 


cgtgatttgt 


cactccagtt 


taaaagacca 


7620 


atcaacccta 


ctgaccagtc 


atcgtatatt 


gttgatagtg 


ttgctgtgaa 


aaatggcgcg 


7680 


%^ ^ ^ \^ W W 






ci o, go. ccT. a ^ g 


agagacatcc 


gctctcccat 


7740


tttgtcaatt 


tagacaattt 


gagagctaac 


aacactaaag 


gttcactgcc 


tattaatgtc 


7800 


atagtttttg 


atggcaagtc 


caaatgcgac 


gagtctgctt 


ctaagtctgc 


ttctgtgtac 


7860 


tacagtcagc 


tgatgtgcca 


acctattctg 


ttgcttgacc 


aagctcttgt 


atcagacgtt 


7920 
ggagatagta


ctgaagtttc cgttaagatg


tttgatgctt 


atgtcgacac 


cttttcagca 


7980 


acttttagtg 


ttcctatgga aaaacttaag 


gcacttgttg 


ctacagctca 


cagcgagtta 


8040 


gcaaagggtg 


tagctttaga tggtgtcctt 


tctacattcg 


tgtcagctgc ccgacaaggt


8100 


gttgttgata 


ccgatgttga cacaaaggat


gttattgaat 


gtctcaaact 


ttcacatcac 


8160 


tctgacttag aagtgacagg tgacagttgt 


aacaatttca 


tgctcaccta taataaggtt 


8220


gaaaacatga 


cgcccagaga tcttggcgca 


tgtattgact 


gtaatgcaag


gcatatcaat 


8280 


gcccaagtag 


caaaaagtca caatgtttca 


ctcatctgga 


atgtaaaaga 


ctacatgtct 


8340 


ttatctgaac 


agctgcgtaa acaaattcgt 


agtgctgcca 


agaagaacaa 


catacctttt


8400 


agactaactt 


gtgctacaac tagacaggtt 


gtcaatgtca 


taactactaa 


aatctcactc 


8460 


aagggtggta 


agattgttag tacttgtttt 


aaacttatgc 


ttaaggccac 


attattgtgc 


8520 


gttcttgctg


cattggtttg ttatatcgtt 


atgccagtac 


atacattgtc 


aatccatgat 


8580 


ggttacacaa 


atgaaatcat tggttacaaa 


gccattcagg 


atggtgtcac 


tcgtgacatc 


8640 


atttctactg 


atgattgttt tgcaaataaa 


catgctggtt


ttgacgcatg 


gtttagccag 


8700 


cgtggtggtt 


catacaaaaa tgacaaaagc 


tgccctgtag 


tagctgctat 


cattacaaga 


8760 


gagattggtt tcatagtgcc tggcttaccg 


ggfcactgtgc 


tgagagcaat 


caatggtgac


8820


ttcttgcatt ttctacctcg tgtttttagt 


get gttggca 


acatttgcta 


cacaccttcc 


8880 


aaactcattg 


agtatagtga ttttgctacc 


tctgcttgcg 


ttcttgctgc 


tgagtgtaca 


8940 


atttttaagg 


atgctatggg caaacctgtg 


ccatattgtt 


atgacactaa 


tttgctagag 


9000


ggttctattt 


cttatagtga gcttcgtcca 


gacactcgtt 


atgtgcttat ggatggttcG 


9060 


^tcatacagt 


ttcctaacac ttacctggag 


ggttctgtta 


gagtagtaac 


aacttttgat 


9120 


gctgagtact 


gtagacatgg tacatgcgaa 


aggtcagaag 


taggtatttg 


cctatctacc 


9180 


agtggtagat 


gggttcttaa taatgagcat 


tacagagctc 


tatcaggagt 


tttctgtggt


9240 


gttgatgcga 


tgaatctcat agctaacatc 


tttactcctc 


ttgtgcaacc tgtgggtgct 


9300 


ttagatgtgt 


ctgcttcagt agtggctggt 


ggtattattg 


ccatattggt 


gacttgtgct 


9360 


gcctactact 


ttatgaaatt cagacgtgtt 


tttggtgagt 


acaaccatgt 


tgttgctgct


9420 


aatgcadttt 


tgtttttgat gtctttcact 


atactctgtc 


tggtaccagc ttacagcttt 


9480 


ctgccgggag tctactcagt cttttacttg 


tacttgacat 


tctatttcac 


caatgatgtt


9540 


tcattcttgg 


ctcaccttca atggtttgcc 


atgttttctc 


ctattgtgcc 


tttttggata 


9600 


acagcaatct 


atgtattctg tatttctctg 


aagcactgcc 


attggttctt 


taacaactat 


9660 


cttaggaaaa gagtcatgtt taatggagtt 


acatttagta


ccttcgagga 


ggctgctttg 


9720 


tgtacctttt 


tgctcaacaa ggaaatgtac 


ctaaaattgc 


gtagcgagac 


actgttgcca 


9780 
cttacacagt


ataacaggta 


tcttgctcta 


tataacaagt 


acaagtattt 


cacrtcfQaorcc 


9840 


ttagatacta 


ccagctatcg 


tgaagcagct 


tgctgccact 


tagcaaaggc


tctaaatgac 


9900 


tttagcaact


caggtgctga


tgttctctac 


caaccaccac 


agacatcaal: 


cacttctgct 


9960 


gttctgcaga 


gtggttttag 


gaaaatggca 


ttcccgtcag 


gcaaagttga 


aaaa t aca t cr 


10020 


gtacaagtaa 


cctgtggaac 


tacaactctt 


aatggattgt 


cjcft t acf at cf a 


cacaatatac 


10080 


tgtccaagac 


atgtcatttg 


cacagcagaa 


gacatgctta 


atcctaacta 


tcraaaatcta 


10140 


ctcattcgca 


aatccaacca 


tagctttctt 


gttcaggctg


Cf caa t crt" +" ca 


a c t" i~rTr t"rri"l" 




attggccatt 


ctatgcaagia 


ttgtctgctt 


aggcttaaag 


ttgatacttc


taaccctaag 


10260 


acacGcaagt 


ataaatttgt 


ccgtatccaa 


cctggtcaaa 


cattttcaat

\^ \^ \mm \^ V> ^ 


t c t acTp a 1" fjc 

^ ^ w ^ N-» d w y w 


10320 


tacaatggtt 


caccatctgg 


tgtttatcag 


tgtgccatga 


gacctaatca 


taccattaaa 


10380 


ggttctttcc


ttaatggatc 


atgtggtaat 


gttggtttta 


acattgatta 


t crattacato 


10440 


tctttctgct 


atatgcatca 


tatggagctt 


ccaacaggag 


t a ca c crc i" Cf Q 


tact era r't +* a 


10500 


gaaggtaaat 


tctatggtcc 


atttgt.t.gsc 


agacaaactg 


ca c acs cj C"h cs c 


aCtCf't' ?in?^rj?ir' 

L-- d d ^ &l V«< 


10560 


acaaccataa 


cattaaatgt 


tttggcatgg 


ctgtatgctg 


>w I— L» Is- L- V-f fc* 






tggtttctta 


atagattcac 


cactactttg 


aatgacttta 


accttat cTCic 


aat craa nt ?4C 


X w u o u 


aactatgaac 


ctttgacaca 


agatcatgtt 


gacatattgg 


aacctctttic 


tCfGt caaana 




ggaattgccg 


tcttagatat 


gtgtgctgct 


ttgaaagagc 


tQctcrcaoraa 

W V*« 1^ Vj Cri4 1^ V**" '-^ 


tcfcrtat era At 


10800 


ggtcgtacta 


tccttggtag 


cactatttta 


gaagatgagt 


ttacaccatt 


taatattatt 


10860 


agacaatgct 


ctggtgttac 


cttccaaggt 


aagttcaaga 


aaattgttaa 


gggcactcat 


10920 


cattggatgc 


ttttaacttf 


cttgacatca 


ctattgattc 


ttgttcaaag 


tacacagtgg 


10980 


tcactgtttt 


tctttgttta 


cgagaatgct 


ttcttgccat 


ttactcttgg 


tattatggca 


11040 


attgctgcat 


gtgctatgct 


gcttgttaag 


cataagcacg 


cattcttgtg 


e 1 1 gt 1 1 c t g 


11100 


ttaccttctc 


ttgcaacagt 


tgcttacttt 


aatatggtct 


acatgcctgc


tacrctaaato 


11160 


atgcgtatca 


tgacatggct 


tgaattggct 


gacactagct 


tgtctggtta


taggcttaag 


11220 


gattgtgtta 


tgtatgcttc 


agctttagtt 


ttgcttattc 


tcatgacagc 


tcgcactgtt 


11280 


tatgatgatg 


ctgctagacg 


tgtttggaca 


ctgatgaatg 


tcattacact 


tgtttacaaa 


11340 


gtctactatg


gtaatgcttt


agatcaagct 


atttccatgt 


gggccttagt 


tatttctgta 


11400 


acctctaact 


attctggtgt 


cgttacgact 


atcatgtttt 


tagctagagc 


tatagtgttt


11460 


gtgtgtgttg 


agtattaccc 


attgttattt 


attactggca 


acaccttaca 


gtgtatcatg 


11520 


cttgtttatt 


gtttcttagg 


ctattgttgc 


tgctgctact 


ttggcctttt 


ctgtttactc 


11580 
aaccgttact


tcaggcttac 


tcttggtgtt tatgactact 


tggtctctac acaagaattt 


11640


aggtatatga 


actcccaggg 


gcttttgcct 


cctaagagta 


gtattgatgc 


tttcaagctt 


11700 


aacattaagt 


tgttgggtat 


tggaggtaaa 


ccatgtatca


aggttgctac 


tgtacagtct 


11760 


aaaatgtctg 


acgtaaagtg 


cacatctgtg 


gtactgctct 


cggttcttca acaacttaga 


11820 


gtagagtcat 


cttctaaatt 


gtgggcacaa 


tgtgtacaac 


tccacaatga 


tattcttctt 


11880 


gcaaaagaca 


caactgaagc 


tttcgagaag 


atggtttctc 


ttttgtctgt 


tttgctatcc 


11940 


atgcagggtg 


ctgtagacat 


taataggttg tgcgaggaaa 


tgctcgataa 


ccgtgctact 


12000 


cttcaggcta 


ttgcttcaga 


atttagttct 


ttaccatcat 


atgccgctta 


tgccactgcc 


12060 


caggaggcct 


atgagcaggc 


tgtagctaat 


ggtgattctg 


aagtcgttct 


caaaaagtta 


12120


aagaaatctt 


tgaatgtggc taaatctgag tttgaccgtg


atgctgccat 


gcaacgcaag 


12180 


ttggaaaaga 


tggcagatca 


ggctatgacc 


caaatgtaca 


aacaggcaag 


atctgaggac 


12240


aagagggcaa 


aagtaactag 


tgctatgcaa 


acaatgctct 


tcactatgct 


taggaagctt 


12300 


gataatgatg 


cacttaacaa 


cattatcaac 


aatgcgcgtg 


atggttgtgt 


tccactcaac 


12360 


atcataccat 


tgactacagc agccaaactc 


atggttgttg 


tccctgatta tggtacctac


12420 


aagaacactt 


gtgatggtaa 


cacctttaca 


tatgcatctg


cactctggga 


aatccagcaa 


12480 


gttgttgatg


cggatagcaa 


gattgttcaa 


cttagtgaaa


ttaacatgga 


caattcacca 


12540


aatttggctt 


ggcctcttat 


tgttacagct 


ctaagagcca 


actcagctgt 


taaactacag 


12600 


aataatgaac 


tgagtccagt 


agcactacga 


cagatgtcct 


gtgcggctgg taccacacaa


12660 


acagcttgta 


ctgatgacaa 


tgcacttgcc 


tactataaca


attcgaaggg aggtaggttt


12720 


gtgctggcat 


tactatcaga 


ccaccaagat 


ctcaaatggg 


ctagattccc 


taagagtgat 


12780 


ggtacaggta 


caatttacac 


agaactggaa 


ccaccttgta 


ggtttgttac 


agacacacca 


12840 


aaagggccta 


aagtgaaata 


cttgtacttc 


atcaaaggct 


taaacaacct 


aaatagaggt 


12900 


atggtgctgg 


gcagtttagc 


tgctacagta 


cgtcttcagg 


ctggaaatgc 


tacagaagta 


12960 


cctgccaatt 


caactgtgct


ttccttctgt 


gcttttgcag 


tagaccctgc


taaagcatat 


13020 


aaggattacc 


tagcaagtgg 


aggacaacca 


atcaccaact 


gtgtgaagat 


gttgtgtaca 


13080 


cacactggta 


caggacaggc 


aattactgta 


acaccagaag 


ctaacatgga 


ccaagagtcc 


13140


tttggtggtg 


cttcatgttg 


tctgtattgt 


agatgccaca 


ttgaccatcc 


aaatcctaaa 


13200 


ggattctgtg 


acttgaaagg 


taagtacgtc 


caaataccta 


ccacttgtgc 


taatgaccca 


13260 


gtgggtttta


cacttagaaa 


cacagtctgt 


accgtctgcg 


gaatgtggaa 


aggttatggc 


13320 


tgtagttgtg 


accaactccg 


cgaacccttg


atgcagtctg 


cggatgcatc 


aacgttttta 


13380 


aacgggtttg 


cggtgtaagt 


gcagcccgtc 


ttacaccgtg 


cggcacaggc 


actagtactg 


13440 
atgtcgtcta


cagggctttt 


gatatttaca 


acgaaaaagt 


tgctggtttt 


gcaaagttcc 


13500 


taaaaactaa 


ttgctgtcgc ttccaggaga 


aggatgagga 


aggcaattta 


ttagactett 


13560 


actttgtagt 


taagaggcat 


actatgtcta 


actaccaaca 


tgaagagact 


atttataact 


13620 


tggttaaaga


ttgtccagcg 


gttgctgtcc 


atgacttttt 


caagtttaga 


gtagatggtg 


13680 


acatggtacc 


acatatatca 


cgtcagcgtc 


taactaaata 


cacaatggct 


gatttagtct 


13740 


atgctctacg 


tcattttgat 


gagggtaatt 


gtgatacatt


aaaagaaata


ctcgtcacat 


13800 


acaattgctg 


tgatgatgat tatttcaata 


agaaggattg 


gtatgacttc 


gtagagaatc 


13860 


ctgacatctt 


acgcgtatat 


gctaacttag 


gtgagcgtgt 


acgccaatca 


ttattaaaga 


13920 


ctgtacaatt 


ctgcgatgct 


atgcgtgatg 


caggcattgt 


aggcgtactg 


acattagata 


13980 


atcaggatct 


taatgggaac 


tggtacgatt 


tcggtgattt 


cgtacaagta 


gcaccaggct 


14040 


gcggagttcc 


tattgtggat 


tcatattact 


cattgctgat 


gcccatcctc 


actttgacta 


14100 


gggcattggc 


tgctgagtcc 


catatggatg 


ctgatctcgc aaaaccactt 


attaacrtacTQ 


14160 


atttgctgaa 


atatgatttt 


acggaagaga 


gactttgtct 


cttcgaccgt 


tattttaaat 


14220 


attgggacca 


gacataccat 


cccaattgta 


ttaactgttt 


ggatgatagg 


tgtatccttc


14280 


attgtgcaaa 


ctttaatgtg 


ttattttcta 


ctgtgtttcc 


acctacaagt 


tttggaccac 


14340 


tagtaagaaa 


aatatttgta 


gatggtgttc 


cttttgttgt 


ttcaactgga 


taccattttc 


14400 


gtgagttagg 


agtcgtacat 


aatcaggatg 


taaacttaca 


tagctcgcgt 


ctcagtttca 


14460 


aggaactttt 


agtgtatgct 


gctgatccag 


ctatgcatgc 


agcttctggc 


aatttattgc 


14520 


tagataaacg 


cactacatgc 


ttttcagtag 


ctgcactaac 


aaacaatgtt 


gcttttcaaa 


14580 


ctgtcaaacc 


cggtaatttt 


aataaagact 


tttatgactt 


tgctgtgtct 


aaaggtttct 


14640 


ttaaggaagg 


aagttctgtt 


gaactaaaac 


acttcttctt 


tgctcaggat 


ggcaacgctg 


14700 


ctatcagtga 


ttatgactat tatcgttata 


atctgccaac 


aatgtgtgat 


atcagacaac 


14760 


tcctattcgt 


agttgaagtt 


gttgataaat 


actttgattg 


ttacgatggt 


ggctgtatta 


14820 


atgccaacca 


agtaatcgtt 


aacaatctgg 


ataaatcagc tggtttccca


tttaataaat


14880 


ggggtaaggc 


tagactttat 


tatgactcaa 


tgagttatga 


ggatcaagat 


gcacttttcg 


14940 


cgtatactaa 


gcgtaatgtc 


atccctacta 


taactcaaat 


gaatcttaag 


tatgocatta 


15000


/~f 4- a a /^a 2j 

g t-goaoetgaci 


tagagctcgc 


accgtagctg 


gtgtctctat 


ctgtagtact 


atgacaaata 


15060 


gacagtttca 


tcagaaatta 


ttgaagtcaa 


tagccgccac tagaggagct 


actgtggtaa 


15120 


ttggaacaag 


caagttttac 


ggtggctggc 


ataatatgtt 


aaaaactgtt 


tacagtgatg 


15180 


tagaaactcc 


acaccttatg 


ggttgggatt 


atccaaaatg 


tgacagagcc 


atgcctaaca 


15240 
tgcttaggat


aatggcctct 


cttgttcttg 


ctcgcaaaca 


taacacttgc 


tgtaacttat 


15300


cacaccgttt 


ctacaggtta 


gctaacgagt 


gtgcgcaagt 


attaagtgag 


atggtcatgt 


15360 


gtggcggctc 


actatatgtt 


aaaccaggtg 


gaacatcatc 


cggtgatgct 


acaactgctt 


15420 


atgctaatag 


tgtctttaac 


atttgtcaag 


ctgttacagc 


caatgtaaat 


gcacttcttt 


15480 


caactgatgg 


taataagata 


gctgacaagt 


atgtccgcaa 


tctacaacac 


aggctctatg 


15540 


agtgtctcta 


tagaaatagg 


gatgttgatc 


atgaattcgt 


ggatgagttt 


tacgcttacc 


15600 


tgcgtaaaca 


tttctccatg 


atgattcttt 


ctgatgatgc 


cgttgtgtgc 


tataacagta 


15660 


actatgcggc 


tcaaggttta 


gtagctagca 


ttaagaactt


taaggcagtt


ctttattatc


15720


aaaataatgt 


gttcatgtct 


gaggcaaaat 


gttggactga 


gactgacctt 


actaaaggac 


15780 


ctcacgaatt 


ttgctcacag 


catacaatgc 


tagttaaaca 


aggagatgat 


tacgtgtacc 


15840 


tgccttaccc 


agatccatca 


agaatattag 


gcgcaggctg 


ttttgtcgat 


gatattgtca 


15900 


aaacagatgg


tacacttatg 


attgaaaggt 


tcgtgtcact


ggctattgat


gcttacccac 


15960 


ttacaaaaca 


tcctaatcag 


gagtatgctg 


atgtctttca 


Gttgtattta 


caatacatta 


16020 


gaaagttaca 


tgatgagctt 


actggccaca 


tgttggacat 


gtattccgta 


atgctaacta


16080 


atgataacac 


ctcacggtac 


tgggaacctg 


agttttatga 


ggctatgtac 


acaccacata 


16140 


cagtcttgca 


ggctgtaggt


gcttgtgtat 


tgtgcaattc 


acagacttca 


cttcgttgcg 


16200 


gtgcctgtat 


taggagacca 


ttcctatgtt 


gcaagtgctg 


ctatgaccat 


gtcatttcaa 


16260 


catcacacaa 


attagtgttg 


tctgttaatc


cctatgtttg 


caatgcccca 


ggttgtgatg 


16320


tcactgatgt 


gacacaactg 


tatctaggag 


gtatgagcta 


ttattgcaag


teacataagc 


16380 


ctcccattag 


ttttccatta 


tgtgctaatg 


gtcaggtttt 


tggtttatac 


aaaaacacat 


16440 


gtgtaggcag 


tgacaatgtc 


actgacttca 


atgcgatagc 


aacatgtgat 


tggactaatg 


16500 


ctggcgatta 


catacttgcc 


aacacttgta 


ctgagagact 


caagcttttc 


gcagcagaaa 


16560 


cgctcaaagc 


cactgaggaa 


acatttaagc 


tgtcatatgg 


tattgccact 


gtacgcgaag 


16620 


tactctctga 


cagagaattg 


catctttcat 


gggaggttgg 


aaaacctaga 


ccaccattga 


16680 


acagaaacta 


tgtctttact 


ggttaccgtg 


taactaaaaa 


tagtaaagta 


cagattggag 


16740 


agtacacctt 


tgaaaaaggt 


gactatggtg 


atgctgttgt 


gtacagaggt 


actacgacat 


16800 


acaagttgaa 


tgttggtgat 


tactttgtgt 


tgacatctca 


cactgtaatg 


ccacttagtg 


16860 


cacctactct 


agtgccacaa 


gagcactatg 


tgagaattac 


tggcttgtac 


ccaacactca 


16920 


acatctcaga 


tgagttttct 


agcaatgttg 


caaattatca 


aaaggtcggc 


atgcaaaagt 


16980 


actctacact 


ccaaggacca 


cctggtactg 


gtaagagtca


ttttgccatc


ggacttgctc 


17040 


tctattaccc 


atctgctcgc 


atagtgtata 


cggcatgctc 


tcatgcagct 


gttgatgccc 


17100 
tatgtgaaaa ggcattaaaa tatttgccca tagataaatg tagtagaatc atacctgcgc 17160

gtgcgcgcgt agagtgtttt gataaattca aagtgaattc aacactagaa cagtatgttt 17220

tctgcactgt aaatgcattg ccagaaacaa ctgctgacat tgtagtcttt gatgaaatct 17280

ctatggctac taattatgac ttgagtgttg tcaatgctag acttcgtgca aaacactacg 17340 

tctatattgg cgatcctgct caattaccag ccccccgcac attgctgact aaaggcacac 17400

tagaaccaga atattttaat tcagtgtgca gacttatgaa aacaataggt ccagacatgt 17460

tccttggaac ttgtcgccgt tgtcctgctg aaattgttga cactgtgagt gctttagttt 17520 

atgacaataa gctaaaagca cacaaggata agtcagctca atgcttcaaa atgttctaca 17580 

aaggtgttat tacacatgat gtttcatctg caatcaacag acctcaaata ggcgttgtaa 17640 

gagaatttcf tacacgcaat cctgcttgga gaaaagctgt ttttatctca ccttataatt 17700 

cacagaacgc tgtagcttca aaaatcttag gattgcctac gcagactgtt gattcatcac 17760

agggttctga atatgactat gtcatattca cacaaactac tgaaacagca cactcttgta 17820

atgtcaaccg cttcaatgtg gctatcacaa gggcaaaaat tggcattttg tgcataatgt 17880

ctgatagaga tctttatgac aaactgcaat ttacaagtct agaaatacca cgtcgcaatg 17940 

tggctacatt acaagcagaa aatgtaactg gactttttaa ggactgtagt aagatcatta 18000 

ctggtcttca tcctacacag gcacctacac acctcagcgt tgatatstaag ttcaagactg 18060

aaggattatg tgttgacata ccaggcatac caaaggacat gaicctaccgt agactcatct 18120

ctatgatggg tttcaaaatg aattaccaag tcaatggtta ccctaatatg tttatcaccc 18180 

gcgaagaagc tattcgtcac gttcgtgcgt ggattggctt tgatgtagag ggctgtcatg 18240 

caactagaga tgctgtgggt actaacctac ctctccagct aggattttct acaggtgtta 18300 

acttagtagc tgtaccgact ggttatgttg acactgaaaa taacacagaa ttcaccagag 18360 

ttaatgcaaa acctccacca ggtgaccagt ttaaacatct tataccactc atgtataaag 18420 

gcttgccctg gaatgtagtg cgtattaaga tagtacaaat gctcagtgat acactgaaag 18480

gattgtcaga cagagtcgtg ttcgtccttt gggcgcatgg ctttgagctt acatcaatga 18540 

agtactttgt caagattgga cctgaaagaa cgtgttgtct gtgtgacaaa cgtgcaactt 18600 

gcttttctac ttcatcagat acttatgcct gctggaatca ttctgtgggt tttgactatg 18660

tctataaccc atttatgatt gatgttcagp agtggggctt tacgggtaac cttcagagta 18720 

accatgacca acattgccag gtacatggaa atgcacatgt ggctagttgt gatgctatca 18780 

tgactagatg tttagcagtc catgagtgct ttgttaagcg cgttgattgg tctgttgaat 18840 

accctattat aggagatgaa ctgagggtta attctgcttg cagaaaagta caacacatgg 18900 
ttgtgaagtc


tgcattgctt 


gctgataagt 


ttccagttct 


tcatgacatt 


ggaaatccaa 


18960 


aggctatcaa 


gtgtgtgcct 


caggctgaag 


tagaatggaa 


gttctacgat 


gctcagccat 


19020 


gtagtgacaa 


agcttacaaa 


atagaggaac


tcttctattc 


ttatgctaca 


catcacgata 


19080 


aattcactga


tggtgtttgt 


ttgttttgga 


attgtaacgt 


tgatcgttac


ccagccaatg 


19140 


caattgtgtg 


taggtttgac 


acaagagtct 


tgtcaaactt 


gaacttacca 


ggctgtgatg 


19200 


gtggtagttt 


gtatgtgaat 


aagcatgcat 


tccacactcc 


agctttcgat 


aaaagtgcat 


19260 


ttactaattt 


aaagcaattg 


cctttctttt 


actattctga 


tagtccttgt


gagtctcatg 


19320 


gcaaacaagt 


agtgtcggat 


attgattatg 


ttccactcaa 


atctgctacg 


tg-tatiiacac 


19380 


gatgcaattt 


aggtggtgct 


gtttgcagac 


accatgcaaa 


tgagtaccga 


cagtacttgg 


19440 


atgcatataa 


tatgatgatt 


tctgctggat 


ttagcctatg 


gatttacaaa 


caatttgata 


19500 


cttataacct 


gtggaataca 


tttaccaggt 


tacagagttt 


agaaaatgtg 


gcttataatg 


19560 


ttgttaataa


aggacacttt 


gatggacacg 


ccggcgaagc 


acctgtttcc


atcattaata 


19620 


atgctgttta 


cacaaaggta 


gatggtattg 


atgtggagat 


ctttgaaaat 


aagacaacac 


19680 


ttcctgttaa 


tgttgcattt 


gagctttggg 


ctaagcgtaa 


cattaaacca 


gtgccagaga


19740 


ttaagatact 


caataatttg 


ggtgttgata 


tcgctgctaa 


tactgtaatc 


tgggactaca 


19800 


aaagagaagc 


cccagcacat 


gtatctacaa 


taggtgtctg 


cacaatgact 


gacattgcca 


19860 


agaaacctac 


tgagagtgct 


tgttcttcac 


ttactgtctt 


gtttgatggt 


agagtggaag 


19920 


gacaggtaga 


cctttttaga 


aacgcccgta 


atggtgtttt 


aataacagaa 


ggttcagtca 


19980 


aaggtctaac 


accttcaaag 


ggaccagcac 


aagctagcgt


caatggagtc


acattaattg 


20040 


gagaatcagt 


aaaaacacag 


tttaactact


ttaagaaagt 


agacggcatt 


attcaacagt 


20100 


tgcctgaaac 


ctactttact 


cagagcagag 


acttagagga 


ttttaagccc 


agatcacaaa 


20160 


tggaaactga 


ctttctcgag 


ctcgctatgg 


atgaattcat 


acagcgatat 


aagctcgagg 


20220 


gctatgcctt 


cgaacacatc 


gtttatggag 


atttcagtca 


tggacaactt 


ggcggtcttc 


20280 


atttaatgat 


aggcttagcc 


aagcgctcac 


aagattcacc 


acttaaatta 


gaggatttta 


20340 


tccctatgga 


cagcacagtg 


aaaaattact 


tcataacaga 


tgcgcaaaca 


ggttcatcaa 


20400 


aatgtgtgtg 


ttctgtgatt 


gatcttttac 


ttgatgactt 


tgtcgagata 


ataaagtcac 


20460 


aagatttgtc 


agtgatttca 


aaagtggtca 


aggttacaat 


tgactatgct 


gaaatttcat 


20520 


tcatgctttg 


gtgtaaggat 


ggacatgttg 


aaaccttcta 


cccaaaacta 


caagcaagtc 


20580 


aagcgtggca 


accaggtgtt 


gcgatgccta 


acttgtacaa 


gatgcaaaga 


atgcttcttg 


20640 


aaaagtgtga 


ccttcagaat 


tatggtgaaa 


atgctgttat 


accaaaagga 


ataatgatga 


20700


atgtcgcaaa 


gtatactcaa 


ctgtgtcaat 


acttaaatac 


acttacttta 


gctgtaccct 


20760 
acaacatgag agttattcac tttggtgctg gctctgataa


aggagttgca ccaggtacag 


20820 


ctgtgctcag acaatggttg 


ccaactggca 


cactacttgt 


cgattcagat 


cttaatgact 


20880 


tcgtctccga cgcagattct actttaattg gagactgtgc 


aacagtacat 


acggctaata 


20940


aatgggacct 


tattattagc 


gatatgtatg


accctaggac 


caaacatgtg 


acaaaagaga 


21000 


atgactctaa


agaagggttt 


ttcacttatc 


tgtgtggatt 


tataaagcaa 


aaactagccc 


21060 


tgggtggttc tatagctgta aagataacag agcattcttg gaatgctgac ctttacaagc 


21120 


ttatgggcca tttctcatgg tggacagctt 


ttgttacaaa 


tgtaaatgca 


tcatcatcgg 


21180 


aagcattttt 


aattggggct 


aactatcttg 


gcaagccgaa 


ggaacaaatt 


gatggctata 


21240 


ccatgcatgc 


taactacatt 


ttctggagga 


acacaaatcc 


tatccagttg 


tcttcctatt 


21300 


cactctttga 


catgagcaaa 


tttcctetta 


aattaagagg 


aactgctgta


atgtctctta 


21360 


aggagaatca aatca'atg^t 


atgatttatt 


ctcttctgga 


aaaaggtagg 


cttatcatta 


21420 


gagaaaacaa 


cagagttgtg 


gtttcaagtg 


atattcttgt 


taacaactaa 


acgaacatgt 


21480 


ttattttctt 


attatttctt 


actctcacta gtggtagtga ccttgaccgg tgcaccactt 


21540 


ttgatgatgt 


tcaagctcct 


aattacactc 


aacatacttc 


atctatgagg 


ggggtttact 


21600 


atcctgatga 


aatttttaga 


tcagacactc 


tttatttaac 


tcaggattta 


tttcttccat 


21660 


tttattctaa tgttacaggg ttfcpatacta ttaatcatac gtttggcaac cctgtcatac 


21720 


cttttaagga 


tggtatttat 


tttgctgcca 


cagagaaatc 


aaatgttgtc 


cgtggttggg


21780


tttttggttc 


taccatgaac 


aacaagtcac 


agtcggtgat 


tattattaac 


aattctacta 


21840 


atgttgttat 


acgagcatgt 


aactttgaat 


tgtgtgacaa 


ccctttcttt 


gctgtttcta 


21900 


aacccatggg 


tacacagaca 


catactatga 


tattcgataa 


tgcatttaat 


tgcactttcg 


21960 


agtacatatc 


tgatgccttt 


tcgcttgatg 


tttcagaaaa 


gtcaggtaat 


tttaaacact 


22020 


tacgagagtt 


tgtgtttaaa 


aataaagatg 


ggtttctcta 


tgtttataag 


ggctatcaac 


22080 


ctatagatgt 


agttcgtgat 


ctaccttctg 


gttttaacac 


tttgaaacct 


atttttaagt 


22140


tgcctcttgg 


tattaacatt 


acaaatttta 


gagccattct 


tacagccttt 


tcacctgctc 


22200


aagacatttg 


gggcacgtca 


gctgcagcct 


attttgttgg 


ctatttaaag


ccaactacat 


22260 


ttatgctcaa 


gtatgatgaa 


aatggtacaa 


tcacagatgc 


tgttgattgt 


tctcaaaatc 


22320 


cacttgctga 


actcaaatgc 


tctgttaaga 


gctttgagat 


tgacaaagga 


atttaccaga 




cctctaattt 


cagggttgtt 


ccctcaggag atgttgtgag attccctaat 


attacaaact 


22440 


tgtgtccttt 


tggagaggtt 


tttaatgcta 


ctaaattccc 


ttctgtctat 


gcatgggaga 


22500 


gaaaaaaaat 


ttctaattgt 


gttgctgatt 


actctgtgct 


ctacaactca 


acattttttt 


22560 
caacctttaa


gtgctatggc


gtttctgcca


ctaagttgaa 


tgatctttgc 


ttctccaatg 


22620 


tctatgcaga 


ttcttttgta 


gtcaagggag 


atgatgtaag 


acaaatagcg 


ccaggacaaa 


22680


ctggtgttat 


tgctgattat 


aattataaat 


tgccagatga 


tttcatgggt 


tgtgtccttg 


22740


cttggaatac 


taggaacatt 


gatgctactt 


caactggtaa 


ttataattat 


aaatataggt


22800 


atcttagaca 


tggcaagctt 


aggccctttg 


agagagacat 


atctaatgtg


cctttctccc


22860 


ctgatggcaa 


accttgcacc 


ccacctgctc 


ttaattgtta 


ttggccatta


aatgattatg 


22920 


gtttttacac 


cactactggc 


attggctacc 


aaccttacag 


agttgtagta 


ctttcttttg 


22980 


aacttttaaa 


tgcaccggcc 


acggtttgtg 


gaccaaaatt 


atccactgac 


cttattaaga 


23040 


accagtgtgt 


caattttaat 


tttaatggac 


tcactggtac 


tggtgtgtta 


actccttctt 


23100 


caaagagatt 


tcaaccattt 


caacaatttg 


gccgtgatgt 


ttctgatttc 


actgattccg 


23160 


ttcgagatcc 


taaaacatct 


gaaatattag 


acatttcacc 


ttgcgctttt 


gggggtgtaa 


23220 


gtgtaattac 


acctggaaca 


aatgcttcat 


ctgaagttgc 


tgttctatat 


caagatgtta 


23280 


actgcactga 


tgtttctaca 


gcaattcatg 


cagatcaact 


cacaccagct 


tggcgcatat 


23340 


attctactgg 


aaacaatgta 


ttccagactc 


aagcaggctg 


tcttatagga 


gctgagcatg 


23400 


tcgacacttc 


ttatgagtgc 


gacattccta 


ttggagctgg 


catttgtgct 


agttaccata 


23460


cagtttcttt 


attacgtagt 


actagccaaa 


aatctattgt 


ggcttatact 


atgtctttag 


23520 


gtgctgatag 


ttcaattgct 


tactctaata 


acaccattgc 


tatacctact 


aacttttcaa 


23580 


ttagcattac 


tacagaagta 


atgcctgttt 


ctatggctaa 


aacctccgta 


gattgtaata 


23640 


tgtacatctg 


cggagattct 


actgaatgtg 


ctaatttgct


tctccaatat 


ggtagctttt


23700 


gcacacaact 


aaatcgtgca 


ctctcaggta 


ttgctgctga 


acaggatcgc


aacacacgtg 


23760 


aagtgttcgc 


tcaagtcaaa 


caaatgtaca 


aaaccccaac 


tttgaaatat 


tttggtggtt 


23820 


ttaatttttc 


apaaatatta. 


cctgaccctc 


taaagccaac 


taagaggtct 


tttattgagg 


23880 


acttgctctt 


taataaggtg 


acactcgctg 


atgctggctt 


catgaagcaa 


tatggcgaat 


23940 


gcctaggtga 


tattaatgct 


agagatctca 


tttgtgcgca 


gaagttcaat 


ggacttacag 


24000 


tgttgccacc


tctgctcact 


gatgatatga 


ttgctgccta 


cactgctgct 


ctagttagtg 


24060


gtactgccac 


tgctggatgg 


acatttggtg 


ctggcgctgc 


tcttcaaata 


ccttttgcta 


24120 


tgcaaatggc 


atataggttc 


aatggcattg 


gagttaccca 


aaatgttctc 


tatgagaacc 


24180


aaaaacaaat 


cgccaaccaa 


tttaacaagg 


cgattagtca 


aattcaagaa 


tcacttacaa 


24240 


caacatcaac 


tgcattgggc 


aagctgcaag 


acgttgttaa 


ccagaatgct 


caagcattaa 


24300 


acacacttgt 


taaacaactt 


agctctaatt 


ttggtgcaat


ttcaagtgtg 


ctaaatgata 


24360 


tcctttcgcg 


acttgataaa 


gtcgaggcgg 


aggtacaaat 


tgacaggtta 


attacaggca


24420 
gacttcaaag


ccttcaaacc


tatgtaacac 


aacaactaat 


cagggctgct


gaaatcaggg 


24480 


cttctgctaa tcttgctgct 


actaaaatgt 


ctgagtgtgt tcttggacaa tcaaaaagag 


24540 


ttgacttttg tggaaagggc taccacctta tgtccttccc 


acaagcagcc 


ccgcatggtg 


24600 


ttgtcttcct 


acatgtcacg 


tatgtgccat 


cccaggagag 


gaacttcacc 


acagcgccag 


24660 


caatttgtca 


tgaaggcaaa 


gcatacttcc ctcgtgaagg 


tgtttttgtg 


tttaatggca 


24720 


cttcttggtt tattacacag aggaacttct tttctccaca aataattact 


acagacaata 


24780 


catttgtctc 


aggaaattgt 


gatgtcgtta ttggcatcat 


taacaacaca 


gtttatgatc 


24840 


ctctgcaacc tgagcttgac 


tcattcaaag 


aagagctgga 


caagtacttc 


aaaaatcata 


24900 


catcaccaga 


tgttgatctt 


ggcgacattt 


caggcattaa 


cgcttctgtc 


gtcaacattc 


24960 


aaaaagaaat 


tgaccgcctc 


aatgaggtcg ctaaaaattt


aaatgaatca 


ctcattgacc 


25020 


ttcaagaatt 


gggaaaatat 


gagcaatata 


ttaaatggcc 


ttggtatgtt


tggctcggct 


25080 


tcattgctgg 


actaattgcc 


atcgtcatgg 


ttacaatctt 


gctttgttgc 


atgactagtt 


25140 


gttgcagttg 


cctcaagggt 


gcatgctctt 


gtggttcttg 


ctgcaagttt 


gatgaggatg 


25200 



actctgagcc 


agttctcaag 


ggtgtcaaat 


tacattacac 


ataaacgaac 


ttatggattt 


25260


gtttatgaga 


ttttttactc 


ttagatcaat 


tactgcacag 


ccagtaaaaa 


ttgacaatgc 


25320 



ttctcctgca agtactgttc atgctacagc


aacgataccg 


ctacaagcct 


cactcccttt


25380 


cggatggctt 


gttattggcg 


ttgcatttct 


tgctgttttt 


cagagcgcta 


ccaaaataat 


25440 


tgcgctcaat 


aaaagatggc 


agctagccct 


ttataagggc 


ttccagttca 


tttgcaattt 


25500 


actgctgcta 


tttgttacca 


tctattcaca 


tcttttgctt 


gtcgctgcag 


gtatggaggc 


25560 


gcaatttttg 


tacctctatg 


ccttgatata 


ttttctacaa 


tgcatcaacg 


catgtagaat 


25620 


tattatgaga 


tgttggcttt 


gttggaagtg 


caaatccaag 


aacccattac 


tttatgatgc 


25680 


caactacttt 


gtttgctggc 


acacacataa ctatgactac


tgtataccat 


ataacagtgt 


25740 


cacagataca 


attgtcgtta 


ctgaaggtga 


cggcatttca 


acaccaaaac 


tcaaagaaga 


25800 


ctaccaaatt 


ggtggttatt 


ctgaggatag 


gcactcaggt 


gttaaagact 


atgtcgttgt 


25860 


acatggctat 


ttcaccgaag 


tttactacca 


gcttgagtct 


acacaaatta 


ctacagacac 


25920 


tggtattgaa 


aatgctacat 


tcttcatctt 


taacaagctt 


gttaaagacc 


caccgaatgt 


25980 


gcaaatacac 


acaatcgacg 


gctcttcagg 


agttgctaat 


ccagcaatgg 


atccaattta 


26040 


tgatgagccg 


acgacgacta 


ctagcgtgcc 


tttgtaagca 


caagaaagtg 


agtacgaact 


26100 


tatgtactca 


ttcgtttcgg 


aagaaacagg 


tacgttaata 


gttaatagcg 


tacttctttt 


26160 


tcttgctttc 


gtggtattct 


tgctagtcac 


actagccatc 


cttactgcgc 


ttcgattgtg 


26220 
tgcgtactgc


tgcaatattg ttaacgtgag


tttagtaaaa 


ccaacggttt 


acgtctactc 


26280 


gcgtgttaaa 


aatctgaact 


cttctgaagg 


agttcctgat 


cttctggtct 


aaacgaacta 


26340 


actattatta 


ttattctgtt tggaacttta 


acattgctta 


tcatggcaga


caacggtact


26400


attaccgttg 


aggagcttaa 


acaactcctg 


gaacaatgga 


acctagtaat 


aggtttccta


26460 


ttcctagcct 


ggattatgtt 


actacaattt 


gcctattcta 


atcggaacag gtttttgtac


26520 


ataataaagc 


ttgttttcct 


ctggctcttg 


tggccagtaa 


cacttgcttg


ttttgtgctt 


26580 


gctgctgtct 


acagaattaa 


ttgggtgact 


ggcgggattg 


cgattgcaat 


ggcttgtatt 


26640 


gtaggcttga 


tgtggcttag ctacttcgtt 


gcttccttca 


ggctgtttgc tcgtacccgc 


26700


tcaatgtggt 


cattcaaccc 


agaaacaaac 


attcttctca 


atgtgcctct 


ccgggggaca 


26760 


attgtgacca 


gaccgctcat 


ggaaagtgaa 


cttgtcattg 


gtgctgtgat 


cattcgtggt 


26820


cacttgcgaa 


tggccggaca 


ctccctaggg 


cgctgtgaca 


ttaaggacct 


gccaaaagag 


26880 


atcactgtgg 


ctacatcacg 


aacgctttct 


tattacaaat 


taggagcgtc 


gcagcgtgta 


26940


ggcactgatt 


caggttttgc 


tgcatacaac 


cgctaccgta 


ttggaaacta 


taaattaaat 


27000 


acagaccacg 


ccggtagcaa 


cgacaatatt 


gctttgctag 


tacagtaagt 


gacaacagat 


27060 


gtttcatctt 


gttgacttcc 


aggttacaat 


agcagagata


ttgattatca 


ttatgaggac 


27120


tttcaggatt 


gctatttgga 


atcttgacgt 


tataataagt 


tcaatagtga 


gacaattatt 


27180 


taagcctcta 


actaagaaga 


attattcgga 


gttagatgat 


gaagaaccta 


tggagttaga 


27240 


ttatccataa 


aacgaacatg 


aaaattattc 


tcttcctgac 


attgattgta 


tttacatctt 


27300 


gcgagctata 


tcactatcag gagtgtgtta


gaggtacgac 


tgtactacta


aaagaacctt 


27360 


gcccatcagg 


aacatacgag ggcaattcac


catttcaccc 


tcttgctgac 


aataaatttg


27420 


cactaacttg 


cactagcaca 


cactttgctt 


ttgcttgtgc 


tgacggtact 


cgacatacct 


27480 


atcagctgcg 


tgcaagatca 


gtttcaccaa 


aacttttcat 


cagacaagag 


gaggttcaac 


27540 


aagagctcta


ctcgccactt 


tttctcattg 


ttgctgctct 


agtattttta 


atactttgct 


27600 


tcaccattaa 


gagaaagaca 


gaatgaatga 


gctcacttta 


attgacttct 


atttgtgctt 


27660 


tttagccttt 


ctgctattcc 


ttgttttaat 


aatgcttatt 


atattttggt 


tttcactcga 


27720


aatccaggat 


ctagaagaac 


cttgtaccaa 


agtctaaacg 


aacatgaaac 


ttctcattgt 


27780 


tttgacttgt 


atttctctat 


gcagttgcat 


atgcactgta 


gtacagcgct 


gtgcatctaa 


27840


taaacctcat 


gtgcttgaag 


atccttgtaa 


ggtacaacac 


taggggtaat 


acttatagca 


27900 


ctgcttggct


ttgtgctcta 


ggaaaggttt 


taccttttca 


tagatggcac 


actatggttc 


27960 


aaacatgcac 


acctaatgtt 


actatcaact 


gtcaagatcc 


agctggtggt 


gcgcttatag 


28020 


ctaggtgttg 


gtaccttcat 


gaaggtcacc 


aaactgctgc 


atttagagac 


gtacttgttg 


28080 
ttttaaataa


acgaacaaat taaaatgtct gataatggac cccaatcaaa ccaacgtagt 


28140 


gccccccgca 


ttacatttgg 


tggacccaca 


gattcaactg


acaataacca 


gaatggagga 


28200 



cgcaatgggg 


caaggccaaa 


acagcgccga 


ccccaaggtt 


tacccaataa 


tactgcgtct 


28260 



tggttcacag 


ctctcactca 


gcatggcaag 


gaggaactta 


gattccctcg 


aggccagggc 


28320 



gttccaatca acaccaatag tggtccagat 


gaccaaattg 


gctactaccg 


aagagctacc 


28380


cgacgagttc 


gtggtggtga 


cggcaaaatg 


aaagagctca 


gccccagatg 


gtacttctat 


28440


tacctaggaa 


ctggcccaga 


agcttcactt 


ccctacggcg ctaacaaaga 


aggcatcgta


28500


tgggttgcaa 


ctgagggagc 


cttgaataca 


cccaaagacc 


acattggcac 


ccgcaatcct 


28560


aataacaatg 


ctgccaccgt 


gctacaactt 


cctcaaggaa 


caacattgcc 


aaaaggcttc


28620


tacgcagagg gaagcagagg cggcagtcaa 


gcctcttctc 


gctcctcatc 


acgtagtcgc 


28680


ggtaattcaa


gaaattcaac 


tcctggcagc 


agtaggggaa 


attctcctgc 


tcgaatggct 


28740


agcggaggtg


gtgaaactgc 


cctcgcgcta


ttgctgctag acagattgaa 


ccagcttgag 


28800


agcaaagttt 


ctggtaaagg 


ccaacaacaa 


caaggccaaa 


ctgtcactaa 


gaaatctgct 


28860


gctgaggcat 


ctaaaaagcc tcgccaaaaa 


cgtactgcca 


caaaacagta 


caacgtcact 


28920


caagcatttg 


ggagacgtgg 


tccagaacaa 


acccaaggaa 


atttcgggga 


ccaagaccta 


28980


atcagacaag 


gaactgatta 


caaacattgg 


ccgcaaattg 


cacaatttgc 


tccaagtgcc 


29040


tctgcattct 


ttggaatgtc 


acgcattggc 


atggaagtca 


caccttcggg 


aacatggctg 


29100


acttatcatg 


gagccattaa 


attggatgac 


aaagatccac 


aattcaaaga 


caacgtcata 


29160


ctgctgaaca 


agcacattga 


cgcatacaaa 


acattcccac 


caacagagcc 


taaaaaggac 


29220


aaaaagaaaa 


agactgatga 


agctcagcct 


ttgccgcaga 


gacaaaagaa 


gcagcccact 


29280 


gtgactcttc 


ttcctgcggc 


tgacatggat 


gatttctcca 


gacaacttca 


aaattccatg 


29340 


agtggagctt 


ctgctgattc 


aactcaggca


taaacactca 


tgatgaccac 


acaaggcaga 


29400 


tgggctatgt 


aaacgttttc 


gcaattccgt 


ttacgataca 


tagtctactc 


ttgtgcagaa 


29460 


tgaattctcg 


taactaaaca


gcacaagtag


gtttagttaa 


ctttaatctc 


acatagcaat


29520 


ctttaatcaa 


tgtgtaacat 


tagggaggac 


ttgaaagagc 


caccacattt 


tcatcgaggc 


29580 


cacgcggagt 


acgatcgagg 


gtacagtgaa 


taatgctagg 


gagagctgcc 


tatatggaag 


29640 


agccctaatg tgtaaaatta 


attttagtag 


tgctatcccc 


atgtgatttt 


aatagcttct 


29700 


taggagaatg 


acaaaaaaaa 


aaaaaaaaaa 


aaaaaa 






29736 



<210> 3 
<211> 26 
<212> DNA 
<213> Severe acute respiratory syndrome virus
<400> 3 

tctctaaacg aactttaaaa tctgtg 

<210> 4
<211> 16
<212> DNA
<213> Severe acute respiratory syndrome virus

<400> 4 

caactaaacg aacatg 

<210> 5 
<211> 18 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 5 

cacataaacg aacttatg 

<210> 6 
<211> 16 

<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 6 

tgagtacgaa cttatg 

<210> 7 
<211> 18 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 7 

ggtctaaacg aactaact 

<210> 8 

<211> 11 

<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 8

aactataaat t 



<210> 9 
<211> 17 

<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 9 

tccataaaac gaacatg 



<210> 10 
<211> 24
<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 10

tgctctagta tttttaatac tttg



<210> 11 
<211> 16 
<212> DNA

<213> Severe acute respiratory syndrome virus 

<400> 11 

agtctaaacg aacatg 16

<210> 12
<211> 15 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 12

ctaataaacc tcatg 15

<210> 13
<211> 24

<212> DNA 

<213> Severe acute respiratory syndrome virus
<400> 13

taaataaacg aacaaattaa aatg 24 

<210> 14 

<211> 20

<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 14 

Lys Thr Phe Pro Pro Thr Glu Pro Lys Lys Asp Lys Lys Lys Lys Thr 

1               5                   10                  15



Asp Glu Ala Gln

            20



<210> 15 
<211> 29751 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 15 

atattaggtt tttacctacc caggaaaagc caaccaacct cgatctcttg tagatctgtt 60 
ctctaaacga actttaaaat ctgtgtagct gtcgctcggc tgcatgccta gtgcacctac 120 
gcagtataaa caataataaa ttttactgtc gttgacaaga aacgagtaac tcgtccctct 180 
tctgcagact


gcttacggtt 


tcgtccgtgt 


tgcagtcgat 


catcagcata 


cctaggtttc 


240 


gtccgggtgt 


gaccgaaagg 


taagatggag 


agccttgttc 


ttggtgtcaa


cgagaaaaca 


300 


cacgtccaac


tcagtttgcc


tgtccttcag 


gttagagacg 


tgctagtgcg 


tggcttcggg 


360 


gactctgtgg 


aagaggccct 


atcggaggca 


cgtgaacacc 


tcaaaaatgg 


cacttgtggt 


420 


ctagtagagc 


tggaaaaagg 


cgtactgccc 


cagcttgaac 


agccctatgt 


gttcattaaa 


480 


cgttctgatg 


ccttaagcac 


caatcacggc 


cacaaggtcg 


ttgagctggt 


tgcagaaatg 


540 


gacggcattc 


agtacggtcg 


tagcggtata 


acactgggag 


tactcgtgcc 


acatgtgggc 


600 


gaaaccccaa 


ttgcataccg 


caatgttctt 


cttcgtaaga 


acggtaataa 


gggagccggt


660 


ggtcatagct


atggcatcga 


tctaaagtct 


tatgacttag 


gtgacgagct 


tggcactgat 


720 


cccattgaag 


attatgaaca 


aaactggaac 


actaagcatg 


gcagtggtgc 


actccgtgaa 


780


ctcactcgtg 


agctcaatgg 


aggtgcagtc 


actcgctatg 


tcgacaacaa 


tttctgtggc 


840 


ccagatgggt 


accctcttga 


ttgcatcaaa


gattttctcg 


cacgcgcagg


caagtcaatg 


900 


tgcactcttt 


ccgaacaact 


tgattacatc 


gagtcgaaaa 


gaggtgtcta


ctgctgccgt


960 


gaccatgagc 


atgaaattgc 


ctggttcact 


gagcgctctg


ataagagcta 


cgagcaccag 


1020 


acacccttcg 


aaattaagag 


tgccaagaaa 


tttgacactt 


tcaaaaaaga 


atgcccaaag 


1080


tttgtgtttc 


ctcttaactc 


aaaagtcaaa 


gtcattcaac 


cacgtgttga 


aaagaaaaag 


1140 


actgagggtt 


tcatggggcg 


tatacgctct 


gtgtaccctg 


ttgcatctcc 


acaggagtgt


1200 


aacaatatgc 


acttgtctac 


cttgatgaaa 


tgtaatcatt 


gcgatgaagt


ttcatggcag 


1260 


acgtgcgact 


ttctgaaagc 


cacttgtgaa 


cattgtggca 


ctgaaaattt 


agttattgaa


1320 


ggacctacta 


catgtgggta


cctacctact 


aatgctgtag 


tgaaaatgcc 


atgtcctgcc 


1380 


tgtcaagacc 


cagagattgg 


acctgagcat 


agtgttgcag 


attatcacaa 


ccactcaaac 


1440 


attgaaactc 


gactccgcaa 


gggaggtagg


actagatgtt 


ttggaggctg 


tgtgtttgcc 


1500 


tatgttggct 


gctataataa 


gcgtgcctac 


tgggttcctc 


gtgctagtgc 


tgatattggc 


1560 


tcaggccata 


ctggcattac


tggtgacaat 


gtggagacct 


tgaatgagga 


tctccttgag 


1620 


atactgagtc 


gtgaacgtgt 


taacattaac 


attgttggcg 


attttcattt 


gaatgaagag 


1680 


gttgccatca


ttttggcatc 


tttctctgct 


tctacaagtg 


cctttattga 


cactataaag 


1740 


agtcttgact


acaagtcttt






gcggtaacta


taaagttacc


1800


aagggaaagc 


ccgtaaaagg 


tgcttggaac 


attggacaac 


agagatcagt 


tttaacacca 


1860 


ctgtgtggtt 


ttccctcaca 


ggctgctggt 


gttatcagat 


caatttttgc 


gcgcacactt 


1920 


gatgcagcaa 


accactcaat 


tcctgatttg 


caaagagcag 


ctgtcaccat 


acttgatggt 


1980 
atttctgaac


agtcattacg 


tcttgtcgac 


gccatggttt 


atacttcaga 


cctgctcacc 


2040 


aacagtgtca 


ttattatggc 


atatgtaact 


ggtggtcttg 


tacaacagac 


ttctcagtgg 


2100 


ttgtctaatc 


ttttgggcac 


tactgttgaa 


aaactcaggc 


ctatctttga 


atggattgag


2160 


gcgaaactta 


gtgcaggagt 


tgaatttctc 


aaggatgctt 


gggagattct 


caaatttctc


2220 


attacaggtg 


tttttgacat 


cgtcaagggt 


caaatacagg 


ttgcttcaga


taacatcaag 


2280


gattgtgtaa 


aatgcttcat 


tgatgttgtt 


aacaaggcac 


tcgaaatgtg 


cattgatcaa 


2340 


gtcactatcg 


ctggcgcaaa 


gttgcgatca 


ctcaacttag 


gtgaagtctt 


catcgctcaa 


2400 


agcaagggac 


tttaccgtca 


gtgtatacgt 


ggcaaggagc 


agctgcaact 


actcatgcct 


2460 


cttaaggcac 


caaaagaagt 


aacctttctt 


gaaggtgatt 


cacatgacac 


agtacttacc 


2520 


tctgaggagg 


ttgttctcaa 


gaacggtgaa 


ctcgaagcac 


tcgagacgcc 


cgttgatagc 


2580


ttcacaaatg 


gagctatcgt 


tggcacacca 


gtctgtgtaa 


atggcctcat 


gctcttagag 


2640 


attaaggaca 


aagaacaata 


ctgcgcattg 


tctcctggtt 


tactggctac 


aaacaatgtc 


2700 


tttcgcttaa 


aagggggtgc 


accaattaaa 


ggtgtaacct 


ttggagaaga 


tactgtttgg 


2760 


gaagttcaag 


gttacaagaa 


tgtgagaatc 


acatttgagc 


ttgatgaacg 


tgttgacaaa 


2820 


gtgcttaatg 


aaaagtgctc 


tgtctacact 


gttgaatccg


gtaccgaagt 


tactgagttt


2880


gcatgtgttg 


tagcagaggc 


tgttgtgaag 


actttacaac 


cagtttctga 


tctccttacc 


2940 


aacatgggta 


ttgatcttga 


tgagtggagt 


gtagctacat 


tctacttatt 


tgatgatgct 


3000 


ggtgaagaaa 


acttttcatc 


acgtatgtat 


tgttcctttt 


accctccaga


tgaggaagaa 


3060 


gaggacgatg 


cagagtgtga 


ggaagaagaa 


attgatgaaa 


cctgtgaaca 


tgagtacggt


3120 


acagaggatg 


attatcaagg 


tctccctctg 


gaatttggtg 


cctcagctga 


aacagttcga 


3180 


gttgaggaag 


aagaagagga 


agactggctg 


gatgatacta 


ctgagcaatc 


agagattgag 


3240 


ccagaaccag 


aacctacacc 


tgaagaacca 


gttaatcagt 


ttactggtta 


tttaaaactt 


3300 


actgacaatg 


ttgccattaa 


atgtgttgac 


atcgttaagg 


aggcacaaag 


tgctaatcct 


3360 


atggtgattg 


taaatgctgc 


taacatacac 


ctgaaacatg 


gtggtggtgt 


agcaggtgca 


3420 


ctcaacaagg 


caaccaatgg 


tgccatgcaa 


aaggagagtg 


atgattacat 


taagctaaat


3480 


ggccctctta 


cagtaggagg 


gtcttgtttg 


ctttctggac 


ataatcttgc 


taagaagtgt 


3540 


ctgcatgttg 


ttggacctaa 


cctaaatgca 


ggtgaggaca 


tccagcttct 


taaggcagca 


3600


tatgaaaatt 


tcaattcaca 


ggacatctta 


cttgcaccat 


tgttgtcagc 


aggcatattt 


3660 


ggtgctaaac 


cacttcagtc 


tttacaagtg 


tgcgtgcaga 


cggttcgtac 


acaggtttat 


3720 


attgcagtca 


atgacaaagc 


tctttatgag 


caggttgtca 


tggattatct 


tgataacctg 


3780 


aagcctagag 


tggaagcacc 


taaacaagag 


gagccaccaa 


acacagaaga 


ttccaaaact 


3840 
gaggagaaat


ctgtcgtaca 


gaagcctgtc 


gatgtgaagc 


caaaaattaa


ggcctgcatt


3900 


gatgaggtta 


ccacaacact 


ggaagaaact 


aagtttctta


ccaataagtt


actcttgttt


3960 


gctgatatca atggtaagct


ttaccatgat 


tctcagaaca 


tgcttagagg 


tgaagatatg 


4020 


tctttccttg 


agaaggatgc 


accttacatg 


gtaggtgatg 


ttatcactag 


tggtgatatc


4080 


acttgtgttg 


taataccctc 


caaaaaggct 


ggtggcacta ctgagatgct 


ctcaagagct 


4140 


ttgaagaaag 


tgccagttga 


tgagtatata 


accacgtacc 


ctggacaagg 


atgtgctggt 


4200 


tatacacttg 


aggaagctaa 


gactgctctt 


aagaaatgca 


aatctgcatt 


ttatgtacta


4260


ccttcagaag 


cacctaatgc 


taaggaagag 


attctaggaa 


ctgtatcctg 


gaatttgaga


4320 


gaaatgcttg 


ctcatgctga 


agagacaaga 


aaattaatgc 


ctatatgcat 


ggatgttaga


4380 


gccataatgg 


caaccatcca 


acgtaagtat 


aaaggaatta 


aaattcaaga 


gggcat cgtt 


4440


gactatggtg


tccgattctt 


cttttatact 


agtaaagagc 


ctgtagcttc 


tattattacg


4500 


aagctgaact 


ctctaaatga 


gccgcttgtc 


acaatgccaa 


ttggttatgt 


gacacatggt


4560 


tttaatcttg 


aagaggctgc 


gcgctgtatg 


cgttctctta 


aagctcctgc 


cgtagtatca


4620 


gtatcatcac 


cagatgctgt 


tactacatat 


aatggatacc 


tcacttcgtc 


atcaaagaca


4680


tctgaggagc 


actttgtaga 


aacagtttct 


ttggctggct cttacagaga 


ttggtcctat


4740 


tcaggacagc 


gtacagagtt 


aggtgttgaa 


tttcttaagc gtggtgacaa


aattgtgtac


4800 


cacactctgg 


agagccccgt 


cgagtttcat 


cttgacggtg aggttctttc 


acttaacaaa 


4860 


ctaaagagtc 


tcttatccct 


gcgggaagtt


aagactataa 


aagtgttcac 


aactgtggac 


4920 


aacactaatc 


tccacacaca 


gcttgtggat


atgtctatga 


catatggaca 


gcagtttggt


4980 


ccaacatact 


tggatggtgc 


tgatgttaca 


aaaattaaac 


ctcatgtaaa 


tcatgagggt 


5040 


aagactttct 


ttgtactacc 


tagtgatgac 


acactacgta 


gtgaagcttt 


cgagtactac 


5100 


catactcttg 


atgagagttt


tcttggtagg 


tacatgtctg


ctttaaacca 


cacaaagaaa 


5160 


tggaaatttc 


ctcaagttgg 


tggtttaact 


tcaattaaat 


gggctgataa 


caattgttat 


5220 


ttgtctagtg ttttattagc 


acttcaacag 


cttgaagtca 


aattcaatgc 


accagcactt 


5280


caagaggctt 


attatagagc 


ccgtgctggt 


gatgctgcta 


acttttgtgc 


actcatactc 


5340 


gcttacagta 


ataaaactgt 


tggcgagctt 


ggtgatgtca 


gagaaactat 


gacccatctt 


5400 


ctacagcatg 


ctaatttgga 


atctgcaaag 


cgagttctta 


atgtggtgtg 


taaacattgt 


5460 


ggtcagaaaa 


ctactacctt 


aacgggtgta 


gaagctgtga 


tgtatatggg 


tactctatct 


5520 


tatgataatc 


ttaagacagg 


tgtttccatt 


ccatgtgtgt 


gtggtcgtga 


tgctacacaa 


5580 


tatctagtac 


aacaagagtc 


ttcttttgtt 


atgatgtctg 


caccacctgc 


tgagtataaa 


5640 
ttacagcaag


gtacattctt 


atgtgcgaat


gagtacactg


gtaactatca 


gtgtggtcat 


5700 


tacactcata 


taactgctaa 


ggagaccctc 


tatcgtattg 


acggagctca 


ccttacaaag 


5760 


atgtcagagt 


acaaaggacc 


agtgactgat 


gttttctaca


aggaaacatc 


ttacactaca


5820


accatcaagc 


ctgtgtcgta 


taaactcgat 


ggagttactt 


acacagagat 


tgaaccaaaa


5880 


ttggatgggt 


attataaaaa 


ggataatgct 


tactatacag 


agcagcctat 


agaccttgta 


5940


ccaactcaac 


cattaccaaa 


tgcgagtttt 


gataatttca 


aactcacatg 


ttctaacaca


6000 


aaatttgctg 


atgatttaaa 


tcaaatgaca 


ggcttcacaa 


agccagcttc


acgagagcta 


6060 


tctgtcacat 


tcttcccaga


cttgaatggc 


gatgtagtgg 


ctattgacta 


tagacactat


6120


tcagcgagtt 


tcaagaaagg 


tgctaaatta 


ctgcataagc 


caattgtttg 


gcacattaac 


6180 


caggctacaa 


ccaagacaac 


gttcaaacca 


aacacttggt


gtttacgttg 


tctttggagt 


6240 


acaaagccag 


tagatacttc 


aaattcattt 


gaagttctgg 


cagtagaaga 


cacacaagga 


6300


atggacaatc 


ttgcttgtga 


aagtcaacaa 


cccacctctg


aagaagtagt 


ggaaaatcct 


6360 


accatacaga 


aggaagtcat 


agagtgtgac 


gtgaaaacta 


ccgaagttgt 


aggcaatgtc


6420 


atacttaaac 


catcagatga 


aggtgttaaa 


gtaacacaag 


agttaggtca 


tgaggatctt 


6480 


atggctgctt 


atgtggaaaa 


cacaagcatt 


accattaaga 


aacctaatga 


gctttcacta 


6540


gccttaggtt


taaaaacaat 


tgccactcat 


ggtattgctg 


caattaatag 


tgttccttgg 


6600 



agtaaaattt 


tggcttatgt 


caaaccattc 


ttaggacaag 


cagcaattac 


aacatcaaat 


6660 


tgcgctaaga 


gattagcaca 


acgtgtgttt 


aacaattata 


tgccttatgt 


gtttacatta 


6720


ttgttccaat 


tgtgtacttt 


tactaaaagt 


accaattcta 


gaattagagc 


ttcactacct 


6780 


acaactattg


ctaaaaatag 


tgttaagagt 


gttgctaaat 


tatgtttgga 


tgccggcatt 


6840 


aattatgtga 


agtcacccaa 


attttctaaa 


ttgttcacaa 


tcgctatgtg 


gctattgttg


6900 


ttaagtattt 


gcttaggttc tctaatctgt


gtaactgctg 


cttttggtgt 


actcttatct 


6960


aattttggtg 


ctccttctta 


ttgtaatggc 


gttagagaat 


tgtatcttaa 


ttcgtctaac 


7020 


gttactacta 


tggatttctg 


tgaaggttct 


tttccttgca 


gcatttgttt 


aagtggatta 


7080 


gactcccttg 


attcttatcc 


agctcttgaa 


accattcagg 


tgacgatttc 


atcgtacaag 


7140


ctagacttga 


caattttagg tctggccgct 


gagtgggttt 


tggcatatat 


gttgttcaca 


7200 


aaattctttt 


atttattagg tctttcagct 


ataatgcagg 


tgttctttgg 


ctattttgct 


7260


agtcatttca 


tcagcaattc 


ttggctcatg


tggtttatca 


ttagtattgt 


acaaatggca 


7320 


cccgtttctg 


caatggttag 


gatgtacatc 


ttctttgctt 


ctttctacta 


catatggaag 


7380 


agctatgttc 


atatcatgga 


tggttgcacc 


tcttcgactt 


gcatgatgtg 


ctataagcgc 


7440 


aatcgtgcca 


cacgcgttga 


gtgtacaact 


attgttaatg 


gcatgaagag 


atctttctat 


7500 
gtctatgcaa


atggaggccg 


tggcttctgc 


aagactcaca 


attggaattg


tctcaattgt 


7560 


gacacatttt gcactggtag 


tacattcatt


agtgatgaag 


ttgctcgtga


tttgtcactc


7620


cagtttaaaa gaccaatcaa


ccctactgac 


cagtcatcgt 


atattgttga 


tagtgttgct 


7680 


gtgaaaaatg 


gcgcgcttca 


actctacttt 


gacaaggctg 


gtcaaaagac 


ctatgagaga


7740 


catccgctct 


cccattttgt 


caatttagac 


aatttgagag 


ctaacaacac 


taaaggttca 


7800 


ctgcctatta


atgtcatagt 


ttttgatggc 


aagtccaaat 


gcgacgagtc 


tgcttctaag 


7860 


tctgcttctg 


tgtactacag 


tcagctgatg 


tgccaaccta 


ttctgttgct 


tgaccaagct 


7920 


cttgtatcag 


acgttggaga 


tagtactgaa 


gtttccgtta 


agatgtttga 


tgcttatgtc 


7980 


gacacctttt 


cagcaacttt 


tagtgttcct 


atggaaaaac 


ttaaggcact 


tgttgctaca 


8040 


gctcacagcg 


agttagcaaa 


gggtgtagct


ttagatggtg 


tcctttctac 


attcgtgtca 


8100 


gctgcccgac aaggtgttgt


tgataccgat 


gttgacacaa 


aggatgttat 


tgaatgtctc 


8160 


aaactttcac 


atcactctga 


cttagaagtg 


acaggtgaca


gttgtaacaa 


tttcatgctc 


8220 


acctataata 


aggttgaaaa 


catgacgccc 


agagatcttg 


gcgcatgtat


tgactgtaat


8280 


gcaaggcata 


tcaatgccca 


agtagcaaaa 


agtcacaatg 


tttcactcat 


ctggaatgta


8340 


aaagactaca 


tgtctttatc 


tgaacaacta 



cgtaaacaaa 


ttcgtagtgc


tgccaagaag 


8400 


aacaacatac 


cttttagact 


aacttgtgct 


acaactagac 


aggttgtcaa


tgtcataact 


8460


actaaaatct 


cactcaaggg 


tggtaagatt 


gttagtactt 


gttttaaact 


tatgcttaag 


8520 


gccacattat


tgtgcgttct 


tgctgcattg 


gtttgttata 


tcgttatgcc 


agtacataca 


8580 


ttgtcaatcc 


atgatggtta 


cacaaatgaa 


atcattggtt 


acaaagccat 


tcaggatggt 


8640 


gtcactcgtg 


acatcatttc 


tactgatgat 


tgttttgcaa 


ataaacatgc 


tggttttgac 


8700 


gcatggttta 


gccagcgtgg 


tggttcatac 


aaaaatgaca 


aaagctgccc 


tgtagtagct 


8760 


gctatcatta


caagagagat 


tggtttcata 


gtgcctggct 


taccgggtac 


tgtgctgaga 


8820 


gcaatcaatg 


gtgacttctt 


gcattttcta 


cctcgtgttt 


ttagtgctgt 


tggcaacatt 


8880 


tgctacacac 


cttccaaact 


cattgagtat 


agtgattttg 


ctacctctgc 


ttgcgttctt 


8940 


gctgctgagt 


gtacaatttt 


taaggatgct 


atgggcaaac 


ctgtgccata 


ttgttatgac 


9000 


actaatttgc 


tagagggttc 


tatttcttat 


agtgagcttc 


gtccagacac


tcgttatgtg 


9060 


cttatggatg 


gttccatcat 


cH-a y t. u t CC-X. 




tggagggttc 


tgttagagta


9120


gtaacaactt 


ttgatgctga 


gtactgtaga 


catggtacat 


gcgaaaggtc 


agaagtaggt 


9180 


atttgcctat 


ctaccagtgg 


tagatgggtt 


cttaataatg 


agcattacag 


agctctatca 


9240 


ggagttttct 


gtggtgttga 


tgcgatgaat 


ctcatagcta 


acatctttac 


tcctcttgtg 


9300 
caacctgtgg


gtgctttaga tgtgtctgct


tcagtagtgg 


ctggtggtat 


tattgccata 


9360 


ttggtgactt 


gtgctgccta ctactttatg


aaattcagac 


gtgtttttgg 


tgagtacaac 


9420 


catgttgttg


ctgctaatgc acttttgttt 


ttgatgtctt 


tcactatact 


ctgtctggta 


9480


ccagcttaca 


gctttctgcc gggagtctac tcagtctttt 


acttgtactt 


gacattctat


9540 


ttcaccaatg 


atgtttcatt cttggctcac 


cttcaatggt 


ttgccatgtt 


ttctcctatt 


9600


gtgccttttt 


ggataacagc aatctatgta 


ttctgtattt 


ctctgaagca


ctgccattgg 


9660 


ttctttaaca 


actatcttag gaaaagagtc 


atgtttaatg 


gagttacatt 


tagtaccttc 


9720 


gaggaggctg 


ctttgtgtac ctttttgctc 


aacaaggaaa 


tgtacctaaa 


attgcgtagc 


9780


gagacactgt 


tgccacttac acagtataac 


aggtatcttg 


ctctatataa 


caagtacaag 


9840 


tatttcagtg 


gagccttaga tactaccagc tatcgtgaag 


cagcttgctg 


ccacttagca 


9900 


aaggctctaa 


atgactttag caactcaggt 


gctgatgttc 


tctaccaacc 


accacagaca 


9960


tcaatcactt 


ctgctgttct gcagagtggt 


tttaggaaaa 


tggcattccc 


gtcaggcaaa 


10020 


gttgaagggt 


gcatggtaca agtaacctgt 


ggaactacaa 


ctcttaatgg 


attgtggttg 


10080 


gatgacacag 


tatactgtcc aagacatgtc


atttgcacag 


cagaagacat 


gcttaatcct 


10140 


aactatgaag 


atctgctcat tcgcaaatcc 


aaccatagct


ttcttgttca 


ggctggcaat 


10200


gttcaacttc 


gtgttattgg ccattctatg 


caaaattgtc 


tgcttaggct 


taaagttgat 


10260 


acttctaacc 


ctaagacacc caagtataaa


tttgtccgta 


tccaacctgg 


tcaaacattt 


10320 


tcagttctag 


catgctacaa tggttcacca 


tctggtgttt 


atcagtgtgc 


catgagacct 


10380 


aatcatacca 


ttaaaggttc tttccttaat 


ggatcatgtg


gtagtgttgg 


ttttaacatt


10440 


gattatgatt 


gcgtgtcttt ctgctatatg 


catcatatgg 


agcttccaac 


aggagtacac 


10500 


gctggtactg 


acttagaagg taaattctat 


ggtccatttg 


ttgacagaca 


aactgcacag


10560 


gctgcaggta 


cagacacaac cataacatta 


aatgttttgg 


catggctgta 


tgctgctgtt 


10620 


atcaatggtg 


ataggtggtt tcttaataga 


ttcaccacta 


ctttgaatga 


ctttaacctt 


10680 


gtggcaatga 


agtacaacta tgaacctttg 


acacaagatc 


atgttgacat 


attgggacct 


10740 


ctttctgctc 


aaacaggaat tgccgtctta gatatgtgtg 


ctgctttgaa 


agagctgctg 


10800


cagaatggta 


tgaatggtcg tactatcctt 


ggtagcacta 


ttttagaaga 


tgagtttaca 


10860 


ccatttgatg 


ttgttagaca atgctctggt 


gttaccttcc 


aaggtaagtt 


caagaaaatt 


10920


gttaagggca 


ctcatcattg gatgctttta 


actttcttga 


catcactatt 


gattcttgtt 


10980 


caaagtacac 


agtggtcact gtttttcttt 


gtttacgaga 


atgctttctt 


gccatttact 


11040 


cttggtatta


tggcaattgc tgcatgtgct


atgctgcttg 


ttaagcataa 


gcacgcattc 


11100 


ttgtgcttgt 


ttctgttacc ttctcttgca 


acagttgctt 


actttaatat 


ggtctacatg


11160 
cctgctagct gggtgatgcg tatcatgaca tggcttgaat tggctgacac tagcttgtct


11220 


ggttataggc 


ttaaggattg 


tgttatgtat 


gcttcagctt tagttttgct tattctcatg


11280 


acagctcgca 


ctgtttatga 


tgatgctgct 


agacgtgttt 


ggacactgat 


gaatgtcatt 


11340 


acacttgttt


acaaagtcta 


ctatggtaat 


gctttagatc aagctatttc catgtgggcc 


11400


ttagttattt


ctgtaacctc


taactattct 


ggtgtcgtta 


cgactatcat 


gtttttagct 


11460 


agagctatag 


tgtttgtgtg 


tgttgagtat 


tacccattgt 


tatttattac 


tggcaacacc 


11520


ttacagtgta tcatgcttgt ttattgtttc ttaggctatt 


gttgctgctg ctactttggc 


11580 


cttttctgtt 


tactcaaccg 


ttacttcagg 


cttactcttg 


gtgtttatga 


ctacttggtc 


11640 


tctacacaag 


aatttaggta 


tatgaactcc 


caggggcttt 


tgcctcctaa 


gagtagtatt 


11700 


gatgctttca 


agcttaacat 


taagttgttg 


ggtattggag 


gtaaaccatg tatcaaggtt


11760 


gctactgtac 


agtctaaaat 


gtctgacgta 


aagtgcacat 


ctgtggtact 


gctctcggtt 


11820 


cttcaacaac 


ttagagtaga 


gtcatcttct 


aaattgtggg 


cacaatgtgt 


acaactccac


11880


aatgatattc ttcttgcaaa 


agacacaact 


gaagctttcg 


agaagatggt 


ttctcttttg 


11940 


tctgttttgc 


tatccatgca 


gggtgctgta 


gacattaata 


ggttgtgcga 


ggaaatgctc 


12000 


gataaccgtg 


ctactcttca 


ggctattgct 


tcagaattta 


gttctttacc 


atcatatgcc 


12060 


gcttttgcca ctgcccagga ggcctatgag caggctgtag ctaatggtga ttctgaagtc


12120


gttctcaaaa 


agttaaagaa 


atctttgaat 


gtggctaaat 


ctgagtttga 


ccgtgatgct


12180 


gccatgcaac 


gcaagttgga 


aaagatggca 


gatcaggcta 


tgacccaaat 


gtacaaacag 


12240 


gcaagatctg 


aggacaagag 


ggcaaaagta 


actagtgcta 


tgcaaacaat 


gctcttcact 


12300


atgcttagga 


agcttgataa 


tgatgcactt 


aacaacatta 


tcaacaatgc 


gcgtgatggt 


12360 


tgtgttccac 


tcaacatcat 


accattgact 


acagcagcca 


aactcatggt 


tgttgtccct 


12420 


gattatggta 


cctacaagaa 


cacttgtgat 


ggtaacacct 


ttacatatgc 


atctgcactc 


12480 


tgggaaatcc 


agcaagttgt 


tgatgcggat 


agcaagattg 


ttcaacttag 


tgaaattaac 


12540 


atggacaatt 


caccaaattt


ggcttggcct 


cttattgtta 


cagctctaag 


agccaactca 


12600 


gctgttaaac 


tacagaataa


tgaactgagt 


ccagtagcac 


tacgacagat 


gtcctgtgcg 


12660 


gctggtacca 


cacaaacagc ttgtactgat 


gacaatgcac 


ttgcctacta 


taacaattcg 


12720 


aagggaggta


ggtttgtgct 


ggcattacta 


tcagaccacc 


aagatctcaa 


atgggctaga


12780


ttccctaaga 


gtgatggtac 


aggtacaatt 


tacacagaac 


tggaaccacc 


ttgtaggttt 


12840 


gttacagaca 


caccaaaagg 


gcctaaagtg 


aaatacttgt 


acttcatcaa 


aggcttaaac 


12900 


aacctaaata 


gaggtatggt 


gctgggcagt 


ttagctgcta 


cagtacgtct 


tcaggctgga 


12960 
aatgctacag


aagtacctgc 


caattcaact 


gtgctttcct 


tctgtgcttt 


tgcagtagac 


13020 


cctgctaaag 


catataagga 


ttacctagca 


agtggaggac 


aaccaatcac 


caactgtgtg 


13080 


aagatgttgt 


gtacacacac 


tggtacagga 


caggcaatta 


ctgtaacacc 


agaagctaac 


13140 


atggaccaag 


agtcctttgg 


tggtgcttca 


tgttgtctgt 


attgtagatg 


ccacattgac 


13200 


catccaaatc 


ctaaaggatt 


ctgtgacttg 


aaaggtaagt 


acgtccaaat 


acctaccact 


13260 


tgtgctaatg 


acccagtggg 


ttttacactt 


agaaacacag 


tctgtaccgt 


ctgcggaatg 


13320 


tggaaaggtt 


atggctgtag 


ttgtgaccaa 


ctccgcgaac 


ccttgatgca 


gtctgcggat 


13380 


gcatcaacgt 


ttttaaacgg 


gtttgcggtg 


taagtgcagc 


ccgtcttaca 


ccgtgcggca 


13440 


caggcactag 


tactgatgtc 


gtctacaggg 


cttttgatat 


ttacaacgaa 


aaagttgctg 


13500 


gttttgcaaa 


gttcctaaaa 


actaattgct 


gtcgcttcca 


ggagaaggat 


gaggaaggca 


13560 


atttattaga 


ctcttacttt 


gtagttaaga 


ggcatactat 


gtctaactac 


caacatgaag 


13620 


agactattta ' 


taacttggtt 


aaagattgtc 


cagcggttgc 


tgtccatgac 


tttttcaagt 


13680 


ttagagtaga 


tggtgacatg 


gtaccacata 


tatcacgtca 


gcgtctaact 


aaatacacaa 


13740 


tggctgattt- 


agtctatgct 


ctacgtcatt 


ttgatgaggg 


taattgtgat 


acattaaaag ' 


13800 


aaatactcgt 


cacatacaat 


tgctgtgatg 


atgattattt 


caataagaag 


gattggtatg 


13860 


acttcgtaga 


•gaatcctgac 


atcttacgcg 


tatatgctaa 


cttaggtgag 


cgtgtacgcc 


13920 


aatcattatt 


aaagactgta 


'caattctgcg 


atgctatgcg 


tgatgcaggc 


attgtaggcg 


13980 


tactgacatt 


agataatcag 


gatcttaatg 


ggaactggta 


cgatttcggt 


gatttcgtac 


14040 


aagtagcacc 


aggctgcgga 


gttcctattg 


tggattcata 


ttactcattg 


ctgatgccca 


14100 


tcctcacttt 


gactagggca 


ttggctgctg 


agtcccatat 


ggatgctgat .ctcgcaaaac 


14160 


cacttattaa 


gtgggatttg 


ctgaaatatg 


attttacgga 


agagagactt 


tgtctcttcg 


14220 


accgttattt 


taaatattgg 


gaccagacat 


accatcccaa 


ttgtattaac 


tgtttggatg 


14280 


ataggtgtat 


ccttcattgt 


gcaaacttta 


atgtgttatt 


ttctactgtg 


tttccaccta 


14340 


caagttttgg 


accactagta 


agaaaaatat 


ttgtagatgg 


tgttcctttt 


gttgtttcaa 


14400 


ctggatacca 


ttttcgtgag 


ttaggagtcg 


tacataatca 


ggatgtaaac ttacatagct 


14460 


cgcgtctcag 


tttcaaggaa 


cttttagtgt 


atgctgctga 


tccagctatg 


catgcagctt 


14520 


ctggcaattt 


attgctagat 


aaacgcacta 


catgcttttc 


agtagctgca 


ctaacaaaca 


14580 


atgttgcttt 


tcaaactgtc 


aaacccggta 


attttaataa 


agacttttat 


gactttgctg 


14640 


tgtctaaagg 


tttctttaag 


gaaggaagtt 


ctgttgaact 


aaaacacttc 


ttctttgctc 


14700 


aggatggcaa 


cgctgctatc 


agtgattatg 


actattatcg 


ttataatctg 


ccaacaatgt 


14760 


gtgatatcag 


acaactccta 


ttcgtagttg 


aagttgttga 


taaatacttt 


gattgttacg 


14820 
atggtggctg tattaatgcc aaccaagtaa tcgttaacaa tctggataaa tcagctggtt 14880

tcccatttaa taaatggggt aaggctagac tttattatga ctcaatgagt tatgaggatc 14940

aagatgcact tttcgcgtat actaagcgta atgtcatccc tactataact caaatgaatc 15000


ttaagtatgc cattagtgca 


aagaatagag 


ctcgcaccgt 


aqctgatgtc 


tctatctgta 


15060 


gtactatgac 


aaatagacag 


tttcatcaga 


aattattgaa 


gtcaatagcc 


crcoar't* aoaor 


15120 


gagctactgt 


ggtaattgga 


acaagcaagt 


tttacggtgg 


ctggcataat 


atcrttaaaaa 


15180 


ctgtttacag tgatgtagaa 


actccacacc 


ttatcracrtt a 


ggattatcca 


aa a t cri" era G a 


15240 


gagccatgcc 


taacatgctt 


aggataatgg 


cctctcttgt 


tcttgctcgc 


aaac;5l'aaca 


15300 


cttgctgtaa 


cttatcacac 


cgtttctaca 


ggttagctaa 


cgagtgtgcg 


caacrta tt aa 


15360 


gtgagatggt 


catgtgtggc 


ggctcactat 


atgttaaacc 


aggtggaaca 


tcatcnCTCTtcr 

V» M \ip* " " 13 


15420 


atgctacaac 


tgcttatgct 


aatagtgtct 


ttaacatttg 


tcaagctgtt 


acacrcnaat cr 


15480 


taaatgcact 


tctttcaact 


gatggtaata 


agatagctga 


caagtatgt c 


ccrcaai" ct Ac 


15540 


aacacaggct 


ctatgagtgt 


ctctatagaa 


atagggatgt. 


tgatcatgaa 




15600 


agttttacgc 


ttacctgcgt 


aaacatttct 


ccat gatgat 


"t cttt ctgat 


cra'hn'nr'Cft'i'n' 

^ u ^ <^ (_■ ^ U 


M ^ \J \J \J 
* • 


tgtgctataa 


cagtaactat 


•acaactcaaa 


gt'ttagtagc 


ta a c a "Ml a. a. a 

W wl W V» ViA WmL \^ 


aar'1"'l"t'aarrrr* 


•1. •hi / ^ VJ 


cagttcttta ttatcaaaat 


aatgtgttca 


tgtctgaggc 


a a aat* cf "t't era 


act cracrad" cr 


JL 'J f O \J 


accttactaa 


aggacctcac 


gaattttgct 


cacagcatac" 


aatcrct' acrtii 


aaapaacTcracT 


' 15840 


atgattacgt 


gtacctgcct 


taccca'gatc 


catcaagaat 


attaggcgca 


cfcrctcftitttcr 


15900 


tcgatgatat 


tgtcaaaaca 


gatggtacac 


ttatgattga 


aaggt t cgt g 


t cactggct a 


15960 


ttgatgctta cccacttaca aaacatccta atcaggagta t.actaatQt.c tttcacttgt 16020

atttacaata cattagaaag ttacatgatg agcttactgg ccacatgttg gacatgtatt 16080

ccgtaatgct aactaatgat aacacctcac ggtactggga acctgagttt tatgaggcta 16140

tgtacacacc acatacagtc ttgcaggctg taggtgcttg tgtattgtgc aattcacaga 16200

cttcacttcg ttgcggtgcG tgtattagga gaccattcct atgttgcaag tgctgctatg 16260

accatgtcat ttcaacatca cacaaattag tgttgtctgt taatccctat gtttgcaatg 16320

ccccaggttg tgatgtcact gatgtgacac aactgtatct aggaggtatg agctattatt 16380

gcaagtcaca taagcctccc attagttttc cattatgtgc taatggtcag gtttttggtt 16440

tatacaaaaa cacatgtgta ggcagtgaca atgtcactga cttcaatgcg atagcaacat 16500

gtgattggac taatgctggc gattacatac ttgccaacac ttgtactgag agactcaagc 16560

ttttcgcagc agaaacgctc aaagccactg aggaaacatt taagctgtca tatggtattg 16620
ccactgtacg cgaagtactc tctgacagag aattgcettct ttcatgggag gttggaaaac 16680

ctagaccacc attgaacaga aactatgtct ttactggtta ccgtgtaact aaaaatagta 16740 

aagtacagat tggagagtac acctttgaaa aaggtgacta tggtgatgct gttgtgtaca 16800 

gaggtactac gacatacaag ttgaatgttg gtgattactt tgtgttgaca tctcacactg 16860

taatgccact tagtgcacct actctagtgc cacaagagca ctatgtgaga attactggct 16920 

tgtacccaac actcaacatc tcagatgagt tttctagcaa tgttgcaaat tatcaaaagg 16980 

tcggcatgca aaagtactct acactccaag gaccacctgg tactggtaag agtcattttg 17040

ccatcggact tgctctctat tacccatctg ctcgcatagt gtatacggca tgctctcatg 17100 

cagctgttga tgccctatgt gaaaaggcat taaaatattt gcccatagat aaatgtagta 17160 

gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa attcaaagtg aattcaacac 17220

tagaacagta tgttttctgc actgtaaatg cattgccaga aacaactgct gacattgtag 17280

tctttgatga aatctctatg gctactaatt atgacttgag tgttgtcaat gctagacttc 17340 

gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt accagccccc cgcacattgc 17400 

tgactaaagg cacactagaa ccagaatatt ttaattcagt gtgcagactt atgaaaacaa 17460

taggtccaga catgttcctt ggaacttgtc gccgttgtcc tgctgaaatt gttgacactg 17520 

tgagtgcttt agtttatgac aataagctaa aagcacacaa ggataagtca gctcaatgct 17580

tcaaaatgtt ctacaaaggt gttattacac atgatgtttc atctgcaatc aacagacctc 17640

aaataggcgt tgtaagagaa tttcttacac gcaatcctgc ttggagaaaa gctgttttta 17700

tctcacctta taattcacag aacgctgtag cttcaaaaaf cttaggattg cctacgcaga 17760

ctgttgattc atcacagggt tctgaatatg actatgtcat attcacacaa actactgaaa 17820 

cagcacactc ttgtaatgtc aaccgcttca atgtggctat cacaagggca aaaattggca 17880

ttttgtgcat aatgtctgat agagatcttt atgacaaact gcaatttaca agtctagaaa 17940

taccacgtcg caatgtggct acattacaag cagaaaatgt aactggactt tttaaggact 18000 

gtagtaagat cattactggt cttcatccta cacaggcacc tacacacctc agcgttgata 18060

taaagttcaa gactgaagga ttatgtgttg acataccagg cataccaaag gacatgacct 18120

accgtagact catctctatg atgggtttca aaatgaatta ccaagtcaat ggttacccta 18180

atatgtttat cacccgcgaa gaagctattc gtcacgttcg tgcgtggatt ggctttgatg 18240 

tagagggctg tcatgcaact agagatgctg tgggtactaa cctacctctc cagctaggat 18300 

tttctacagg tgttaactta gtagctgtac cgactggtta tgttgacact gaaaataaca 18360 

cagaattcac cagagttaat gcaaaacctc caccaggtga ccagtttaaa catcttaicac 18420

cactcatgta taaaggcttg ccctggaatg tagtgcgtat taagatagta caaatgctca 18480 
gtgatacact gaaaggattg tcagacagag tcgtgttcgt cctttgggcg catggctttg 18540

agcttacatc aatgaagtac tttgtcaaga ttggacctga aagaacgtgt tgtctgtgtg 18600

acaaacgtgc aacttgcttt tctacttcat cagatactta tgcctgctgg aatcattctg 18660

tgggttttga ctatgtctat aacccattta tgattgatgt tcagcagtgg ggctttacgg 18720

gtaaccttca gagtaaccat gaccaacatt gccaiggtaca tggaaatgca catgtggcta 18780

gttgtgatgc tatcatgact agatgtttag cagtccatga gtgctttgtt aagcgcgttg 18840

attggtctgt tgaataccct attataggag atgaactgag ggttaattct gcttgcagaa 18900

aagtacaaca catggttgtg aagtctgcat tgcttgctga taagtttcca gttcttcatg 18960

acattggaaa tccaaaggct atcaagtgtg tgcctcaggc tgaagtagaa tggaagttct 19020

acgatgctca gccatgt^gt gacaaagctt acaaaataga ggaactcttc tattcttatg 19080

ctacacatca cgataaattc actgatggtg tttgtttgtt ttggaattgt aacgttgatc 19140

gttacccagc caatgcaatt gtgtgtaggt ttgacacaag agtcttgtca aacttgaatit 19200

taccaggctg tgatggtggt agtttgtatg tgaataagca tgcattccac actccagctt 19260

tcgataaaag tgcatttact aatttaaagc aattgccttt cttttactat tctgatagtc 19320

cttgtgagtc tcatggcaaa caagtagtgt cggatattga ttatgttcca ctcaaatctg 19380

ctacgtgtat tacacgatgc aatttaggtg gtgctgtttg cagacaccat gcaaatgagt 19440

accgacagta cttggatgca tataatatga tgatttctgc tggatttagc ctatggattt 19500

acaaacaatt tgatacttat aacctgtgga atacatttac caggttacag agtttagaaa 19560

atgtggctta taatgttgtt aataaaggac actttgatgg acacgccggc gaagcacctg 19620

tttccatcat taataatgct gtttacacaa aggtagatgg tattgatgtg gagatctttg 19680

aaaataagac aacacttcct gttaatgttg catttgagct ttgggctaag cgtaacatta 19740

aaccagtgcc agagattaag atactcaata atttgggtgt tgatatcgct gctaatactg 19800

taatctggga ctacaaaaga gaagccccag cacatgtatc tacaataggt gtctgcacaa 19860

tgactgacat tgccaagaaa cctactgaga gtgcttgttc ttcacttact gtcttgtttg 19920

atggtagagt ggaaggacag gtagaccttt ttagaaacgc ccgtaatggt gttttaataa 19980

cagaaggttc agtcaaaggt ctaacacctt caaagggacc agcacaagct agcgteaatg 20040

gagtcacatt aattggagaa tcagtaaaaa cacagtttaa ctactttaag aaagtagacg 20100

gcattattca acagttgcct gaaacctact ttactcagag cagagactta gaggatttta 20160

agcccagatc acaaatggaa actgactttc tcgagctcgc tatggatgaa ttcatacagc 20220

gatataagct cgagggctat gccttcgaac acatcgttta tggagatttc agtcatggac 20280
aacttggcgg tcttcattta atgataggct tagccaagcg ctcacaagat tcaccactta 20340

aattagagga ttttatccct atggacagca cagtgaaaaa ttacttcata acagatgcgc 20400

aaacaggttc atcaaaatgt gtgtgttctg tgattgatct tttacttgat gactttgtcg 20460

agataataaa gtcacaagat ttgtcagtga tttcaaaagt ggtcaaggtt acaattgact 20520

atgctgaaat ttcattcatg ctttggtgta aggatggaca tgttgaaacc ttctacccaa 20580

aactacaagc aagtcaagcg tggcaaccag gtgttgcgat gcctaacttg tacaagatgc 20640

aaagaatgct tcttgaaaag tgtgaccttc agaattatgg tgaaaatgct gttataccaa 20700

aaggaataat gatgaatgtc gcaaagtata ctcaactgtg tcaatactta aatacactta 20760

ctttagctgt accctacaac atgagagtta ttcactttgg tgctggctct gataaaggag 20820

ttgcaccagg tacagctgtg ctcagacaat ggttgccaac tggcacacta cttgtcgatt 20880

cagatcttaa tgacttcgtc tccgacgcag attctacttt aattggagac tgtgcaacag 20940

tacatacggc taataaatgg gaccttatta ttagcgatat gtatgaccct aggaccaaac 21000

atgtgacaaa agagaatgac tctaaagaag ggtttttcac ttatctgtgt ggatttataa 21060

agcaaaaact agccctgggt ggttctatag ctgtaaagat aacagagcat tcttggaatg 21120

ctgaccttta caagcttatg ggccatttct catggtggac agcttttgtt acaaatgtaa 21180

atgcatcatc atcggaagca tttttaattg gggctaacta tcttggcaag ccgaaggaac 21240

aaattgatgg ctataccatg catgctaact acattttctg gaggaacaca aatcctatcc 21300

agttgtcttc ctattcactc tttgacatga gcaaatttcc tcttaaatta agaggaactg 21360

ctgtaatgtc tcttaaggag aatcaaatca atg^-tatgat ttattctctt ctggaaaaag 21420

gtaggcttat cattagagaa aacaacagag ttgtggtttc aagtgatatt cttgttaaca 21480

actaaacgaa catgtttatt ttcttattat ttcttactct cactagtggt agtgaccttg 21540

accggtgcac cacttttgat gatgttcaag ctcctaatta cactcaacat acttcatcta 21600

tgaggggggt ttactatcct gatgaaattt ttagatcaga cactctttat ttaactcagg 21660

atttatttct tccattttat tctaatgtta cagggtttca tactattaat catacgtttg 21720

gcaaccctgt catacctttt aaggatggta tttattttgc tgccacagag aaatcaaatg 21780

ttgtccgtgg ttgggttttt ggttctacca tgaacaacaa gtcacagtcg gtgattatta 21840

ttaacaattc tactaatgtt gttatacgag catgtaactt tgaattgtgt gacaaccctt 21900

tctttgctgt ttctaaaccc atgggtacac agacacatac tatgatattc gataatgcat 21960

ttaattgcac tttcgagtac atatctgatg ccttttcgct tgatgtttca gaaaagtcag 22020

gtaattttaa acacttacga gagtttgtgt ttaaaaataa agatgggttt ctctatgttt 22080

ataagggcta tcaacctata gatgtagttc gtgatctacc ttctggtttt aacactttga 22140
aacctatttt taagttgcct cttggtatta acattacaaa ttttagagcc attcttacag 22200

ccttttcacc tgctcaagac atttggggca cgtcagctgc agcctatttt gttggctatt 22260

taaagccaac tacatttatg ctcaagtatg atgaaaatgg tacaatcaca gatgctgttg 22320

attgttctca aaatccactt gctgaactca aatgctctgt taagagcttt gagattgaca 22380

aaggaattta ccagacctct aatttcaggg ttgttccctc sggagatgtt gtgagattcc 22440

ctaatattac aaacttgtgt ccttttggag aggtttttaa tgctactaaa ttcccttctg 22500

tctatgcatg ggagagaaaa aaaatttcta attgtgttgc tgattactct cftcfCtctaGa 22560

actcaacatt tttttcaacc tttaagtgct atggcgtttc tgccactaag ttoa at crate 22620

tttgcttctc caatgtctat gcagattctt ttgtagtcaa QQcraQataat gtaagacaaa 22680

tagcgccagg acaaactggt gttattgctg attataatta taaattgcca crataatttca 22740

tgggttgtgt ccttgcttgg aatactagga acattgatgc tacttcaact ggtaattata 22800

attataaata taggtatctt agacatggca agcttaggcc ctttgagaga gacatatcta 22860

atgtgccttt ctcccctgat ggcaaacctt gcaccccacc tgctcttaat t ci 1 1 a 1 1 a cf c 22920

cattaaatga ttatggtttt tacaccacta ctggcattag ctaccaacct tacagagttg 22980

tagtactttc ttttgaactt ttaaatgcac cggccacggt ttgtggacca aaattatcca 23040

ctgaccttat taagaaccag tgtgtcaatt ttaattttaa tggactcact ggtactggtg 23100

tgttaactcc ttcttcaaag agatttcaac catttcaaca atttaacccrt cratatttcfca 23160

atttcactga ttccgttcga gatcctaaaa catctgaaat attagacatt tcaccttgcg 23220

cttttggggg tgtaagtgta attacacctg gaacaaatgc ttcatctgaa gttgctgttc 23280

tatatcaaga tgttaactgc actgatgttt ctacagcaat tcatgcagat caactcacac 23340

cagcttggcg catatattct actggaaaca atgtattcca gactcaagca ggctgtctta 23400

taggagctga gcatgtcgac acttcttatg agtgcgacat tcctattgga gctggcattt 23460

gtgctagtta ccatacagtt tctttattac gtagtactag ccaaaaatct attgtggctt 23520

atactatgtc tttaggtgct gatagttcaa ttgcttactc taataacacc attgctatac 23580

ctactaactt ttcaattagc attactacag aagtaatgcc tgtttctatg gctaaaacct 23640

ccgtagattg taatatgtac atctgcggag attctactga atgtgctaat ttgcttctcc 23700


aatatggtag 


ct t ttcrr'ar'a 

W Via \» V> k« U v.* CI \^ t» 


»-» Q O L. a d CL L. V-^ 




aggua u ugc^ 


gctgaacagg 


O O T «r A 


atcgcaacac acgtgaagtg ttcgctcaag tcaaacaaat gtacaaaacc ccaactttga 23820

aatattttgg tggttttaat ttttcacaaa tattacctga ccctctaaag ccaactaaga 23880

ggtcttttat tgaggacttg ctctttaata aggtgacact cgctgatgct ggcttcatga 23940
agcaatatgg cgaatgccta ggtgatatta atgctagaga tctcatttgt gcgcagaagt 24000

tcaatggact tacagtgttg ccacctctgc tcactgatga tatgattgct gcctacactg 24060 

ctgctctagt tagtggtact gccactgctg gatggacatt tggtgctggc gctgctcttc 24120

aaataccttt tgctatgcaa atggcatata ggttcaatgg cattggagtt acccaaaatg 24180

ttctctatga gaaccaaaaa caaatcgcca accaatttaa caaggcgatt agtcaaattc 24240 

aagaatcact tacaacaaca tcaactgcat tgggcaagct gcaagacgtt gttaaccaga 24300 

atgctcaagc attaaacaca cttgttaaac aacttagctc taattttggt gcaatttcaa 24360 

gtgtgctaaa tgatatcctt tcgcgacttg ataaagtcga ggcggaggta caaattgaca 24420 

ggttaattac aggcagactt caaagccttc aaacctatgt aacacaacaa ctaatcaggg 24480 

ctgctgaaat cagggcttct gctaatcttg ctgctactaa aatgtctgag tgtgttcttg 24540

gacaatcaaa aagagttgac ttttgtggaa agggctacca ccttatgtcc ttcccacaag 24600

cagccccgca tggtgttgtc ttcctacatg tcacgtatgt gccatcccag gagaggaact 24660

tcaccacagc gccagcaatt tgtcatgaag gcaaagcata cttccctcgt gaaggtgtfet 24720 

ttgtgtttaa tggcacttct tggtttatfca cacagaggaa cttcttttct ccacaaataa 24780

ttactacaga caatacattt gtctcaggaa attgtgatgt cgttattggc atcattaaca 24840

acacagttta tgatcctctg caacctgagc ttgactcatt caaagaagag ctggacaagt 24900

acttcaaaaa tcatacatca ccagatgttg atcttggcga catttcaggc attaacgctt 24960

ctgtcgtcaa cattcaaaaa gaaattgacc gcctcaatga ggtcgctaaa aatttaaatg 25020

aatcactcat tgaccttcaa gaattgggaa aatatgagca atatattaaa tggccttggt 25080

atgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 25140 

gttgcatgac tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 25200 

agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa 25260 

cgaacttatg gatttgttta tgagattttt tactcttaga tcaattactg cacagccagt 25320 

aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca 25380 

agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag 25440

cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca 25500 

gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc 25560

tgcaggtatg gaggcgcaat ttttgtacct ctatgccttg atatattttc tacaatgcat 25620 

caacgcatgt agaattatta tgagatgttg gctttgttgg aagtgcaaat ccaagaaccc 25680 

attactttat gatgccaact actttgtttg ctggcacaca cataactatg actactgtat 25740

accatataac agtgtcacag atacaattgt cgttactgaa ggtgacggca tttcaacacc 25800 
PCT/CA2004/000626

aaaactcaaa gaagactacG aaattggtgg ttattctgag gatag^fcact caggtgttaa 25860

agactatgtc gttgtacatg gctatttcac cgaagtttac taccagcttg agtctacaca 25920

aattactaca gacactggta ttgaaaatgc tacattcttc atctttaaca agcttgttaa 25980

agacccaccg aatgtgcaaa izacacacaat cgacggctct tcaggagttg ctaatccagc 26040

aatggatcca atttatgatg agccgacgac gactactagc gtgcctttgt aagcacaaga 26100

aagtgagtac gaacttatgt actcattcgt ttcggaagaa acaggtacgt taatagttaa 26160

tagcgtactt ctttttcttg ctttcgtggt attcttgcta gtcacactag ccatccttac 26220

tgcgcttcga ttgtgtgcgt actgctgcaa tattgttaac gtgagtttag taaaaccaac 26280

ggtttacgtc tactcgcgtg ttaaaaatct gaactcttct gaaggagttc ctgatcttct 26340

ggtctaaacg aactaactat tattattatt ctgtttggaa ctttaacatt gcttatcatg 26400

gcagacaacg gtactattac cgttgaggag cttaaacaac tcctggaaca atggaaccta 26460

gtaataggtt tcctattcct agcctggatt atgttactac aatttgccta ttctaatcgg 26520

aacaggtttt tgtacataat aaagcttgtt ttcctctggc tcttgtggcc agtaacactt 26580

gcttgttttg tgcttgctgc tgtctacaga attaattggg tgactggcgg gattgcgatt 26640

gcaatggctt gtattgtagg cttgatgtgg cttagctact tcgttgcttc cttcaggctg 26700

tttgctcgta cccgctcaat gtggtcattc aacccagaaa caaacattct tctcaatgtg 26760

cctctccggg ggacaattgt gaccagaccg ctcatggaaa gtgaacttgt cattggtgct 26820

gtgatcattc gtggtcactt gcgaatggcc ggacactccc tagggcgctg tgacattaag 26880

gacctgccaa aagagatcac tgtggctaca tcacgaacgc tttcttatta caaattagga 26940

gcgtcgcagc gtgtaggcac tgattcaggt tttgctgcat acaaccgcta ccgtattgga 27000

aactataaat taaatacaga ccacgccggt agcaacgaca atattgcttt gctagtacag 27060

taagtgacaa cagatgtttc atcttgttga cttccaggtt acaatagcag agatattgat 27120

tatcattatg aggactttca ggattgctat ttggaatctt gacgttataa taagttcaat 27180

agtgagacaa ttatttaagc ctctaactaa gaagaattat tcggagttag atgatgaaga 27240

acctatggag ttagattatc cataaaacga acatgaaaat tattctcttc ctgacattga 27300

ttgtatttac atcttgcgag ctatatcact atcaggagtg tgttagaggt acgactgtac 27360

tactaaaaga accttgccca tcaggaacat acgagggcaa ttcaccattt caccctcttg 27420

ctgacaataa atttgcacta acttgcacta gcacacactt tgcttttgct tgtgctgacg 27480

gtactcgaca tacctatcag ctgcgtgcaa gatcagtttc accaaaactt ttcatcagac 27540

aagaggaggt tcaacaagag ctctactcgc cactttttct cattgttgct gctctagtat 27600
ttttaatact ttgcttcacc attaagagaa agacagaatg aatgagctca ctttaattga 27660

cttctatttg tgctttttag cctttctgct attccttgtt ttaataatgc ttattatatt 27720

ttggttttca ctcgaaatcc aggatctaga agaJaccttgt accaaagtct aaacgaacat 27780

gaaacttctc attgttttga cttgtatttc tctatgcagt tgcatatgca ctgtagtaca 27840

gcgctgtgca tctaataaac ctcatgtgct tgaagatcct tgtaaggtac aacactaggg 27900

gtaatactta tagcactgct tggctttgtg ctctaggaaa ggttttacct tttcatagat 27960

ggcacactat ggttcaaaca tgcacaccta atgttactat caactgtcaa gatccagctg 28020

gtggtgcgct tatagctagg tgttggtacc ttcatgaagg tcaccaaact gctgcattta 28080

gagacgtact tgttgtttta aataaacgaa caaattaaaa tgtctgataa tggaccccaa 28140

tcaaaccaac gtagt^cccc ccgcattaca tttggtggac ccacagattc aactgacaat 28200

aaccagaatg gaggacgcaa tggggcaagg ccaaaacagc gccgacccca aggtttaccc 28260

aataatactg cgtcttggtt cacagctctc actcagcatg gcaaggagga acttagattc 28320

cctcgaggcc agggcgttcc aatcaacacc aatagtggtc cagatgacca aattggctac 28380

taccgaagag ctacccgacg agttcgtggt ggtgacggca aaatgaaaga gctcagcccc 28440

agatggtact tctattacct aggaactggc ccagaagctt cacttcccta cggcgctaac 28500

aaagaaggca tcgtatgggt tgcaactgag ggagccttga atacacccaa agaccacatt 28560

ggcacccgca atcctaataa caatgctgcc accgtgctac aacttcctca aggaacaaca 28620

ttgccaaaag gcttctacgc agagggaagc agaggcggca gtcaagcctc ttctcgctcc 28680

tcatcacgta gtcgcggtaa ttcaagaaat tcaactcctg gcagcagtag gggaaattct 28740

(pctgctcgaa tggctagcgg aggtggtgaa actgccctcg cgctattgct gctagacaga 28800

ttgaaccagc ttgagagcaa agtttctggt aaaggccaac aacaacaagg ccaaactgtc 28860

actaagaaat ctgctgctga ggcatctaaa aagcctcgcc aaaaacgtac tgccacaaaa 28920

cagtacaacg tcactcaagc atttgggaga cigtggtccag aacaaaccca aggaaatttc 28980

ggggaccaag acctaatcag acaaggaact gattacaaac attggccgca aattgcacaa 29040

tttgctccaa gtgcctctgc attctttgga atgtcacgca ttggcatgga agtcacacct 29100

tcgggaacat ggctgactta tcatggagcc attaaattgg atgacaaaga tccacaattc 29160

aaagacaacg tcatactgct gaacaagcac attgacgcat acaaaacatt cccaccaaca 29220

gagcctaaaa aggacaaaaa gaaaaagact gatgaagctc agcctttgcc gcagagacaa 29280

aagaagcagc ccactgtgac tcttcttcct gcggctgaca tggatgattt ctccagacaa 29340

cttcaaaatt ccatgagtgg agcttctgct gattcaactc aggcataaac actcatgatg 29400

accacacaag gcagatgggc tatgtaaacg ttttcgcaat tccgtttacg atacatagtc 29460
tactcttgtg cagaatgaat tctcgtaact aaacagcaca agtaggttta gttaacttta 29520

atctcacata gcaatcttta atcaatgtgt aacattaggg aggacttgaa agagccacca 29580

cattttcatc gaggccacgc ggagtacgat cgagggtaca gtgaataatg ctagggagag 29640

ctgcctatat ggaagagccc taatgtgtaa aattaatttt agtagtgcta tccccatgtg 29700 

attttaatag cttcttagga gaatgacaaa aaaaaaaaaa aaaaaaaaaa a 29751 

<210> 16

<211> 47

<212> DNA

<213> Severe acute respiratory syndrome virus 

<400> 16 



acattttcat cgaggccacg cggagtacga tcgagggtac agtgaat 47
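Each entry in the listing declares its length under `<211>` and repeats the running residue count at the end of each sequence row. A minimal sketch of how that declared length could be checked against the residues actually printed is given below; the helper name `sequence_length` is hypothetical and is not part of the listing or of any standard tool.

```python
import re


def sequence_length(block: str) -> int:
    """Count residues in a listing row or block.

    Spaces, line breaks and the trailing position numbers are ignored;
    only letters are counted as residues.
    """
    return len(re.sub(r"[^a-z]", "", block.lower()))


# SEQ ID NO: 16 as printed in the listing (declared <211> 47).
seq_id_16 = "acattttcat cgaggccacg cggagtacga tcgagggtac agtgaat 47"
declared_length = 47

if sequence_length(seq_id_16) != declared_length:
    raise ValueError("declared <211> length does not match the printed residues")
```

The same check can be applied to any other entry by pasting its `<400>` rows into the string and setting the declared length from its `<211>` field.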


<210> 17
<211> 32
<212> DNA 

<213> Severe acute respiratory syndrome virus 






<400> 17 

cgaggccacg cggagtacga tcgagggtac ag 32


<210> 18
<211> 339
<212> DNA
<213> Severe acute respiratory syndrome virus 






<400> 18 

acactcatga tgaccacaca aggcagatgg gctatgtaaa cgttttcgca attccgttta 60

cgatacatag tctactcttg tgcagaatga attctcgtaa ctaaacagca caagtaggtt 120

tagttaactt taatctcaca tagcaatctt taatcaatgt gtaacattag ggaggacttg 180

aaagagccac cacattttca tcgaggccac gcggagtacg atcgagggta cagtgaataa 240

tgctagggag agctgcctat atggaagagc cctaatgtgt aaaattaatt ttagtagtgc 300

tatccccatg tgattttaat agcttcttag gagaatgac 339



<210> 19 

<211> 35 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> s2m motif 



<220> 

<221> misc_feature
<222> (5)..(5)
<223> n is a, c, g, or t

<220>

<221> misc_feature

<222> (23)..(23)

<223> n is a, c, g, or t

<400> 19 

gccgnggcca cgcsgagtas gancgagggt acags 
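The motif above uses ambiguity codes: `n` is defined in the feature table as a, c, g, or t, and `s` is the IUPAC code for c or g. A minimal sketch of how such a motif could be turned into a regular expression and searched for in a nucleotide string follows. The names `IUPAC`, `motif_to_regex` and `find_motif` are hypothetical, and only the codes that occur in this motif are mapped; it illustrates the matching idea and is not a method described in the listing.

```python
import re

# Only the codes that occur in the motif are mapped (assumption: no other
# IUPAC ambiguity codes are needed for this example).
IUPAC = {"a": "a", "c": "c", "g": "g", "t": "t", "s": "[cg]", "n": "[acgt]"}

# SEQ ID NO: 19 as printed in the listing.
S2M_MOTIF = "gccgnggcca cgcsgagtas gancgagggt acags"


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate a space-separated IUPAC motif into a compiled regex."""
    compact = motif.replace(" ", "").lower()
    return re.compile("".join(IUPAC[base] for base in compact))


def find_motif(sequence: str, motif: str = S2M_MOTIF):
    """Return (1-based start, matched text) for the first hit, or None."""
    compact = re.sub(r"[^a-z]", "", sequence.lower())
    hit = motif_to_regex(motif).search(compact)
    return None if hit is None else (hit.start() + 1, hit.group())


# Fragment taken from the avian infectious bronchitis entry (SEQ ID NO: 32).
ibv_fragment = "cagtgccggg gccacgcgga gtacgatcga gggtacagca"
hit = find_motif(ibv_fragment)
```

A sequence shorter than the 35-nucleotide motif, such as the 32-nucleotide SEQ ID NO: 17, cannot match the full pattern, so `find_motif` returns `None` for it.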



<210> 20 
<211> 26 
<212> RNA 

<213> Severe acute respiratory syndrome virus

<400> 20 

ucucuaaacg aacuuuaaaa ucugug 



<210> 21 
<211> 16 
<212> RNA 

<213> Severe acute respiratory syndrome virus
<400> 21

caacuaaacg aacaug



<210> 22
<211> 18
<212> RNA

<213> Severe acute respiratory syndrome virus 
<400> 22 

cacauaaacg aacuuaug 



<210> 23 

<211> 16 

<212> RNA 

<213> Severe acute respiratory syndrome virus 

<400> 23 
ugaguacgaa cuuaug 



<210> 24 
<211> 18 
<212> RNA 

<213> Severe acute respiratory syndrome virus 
<400> 24 

ggucuaaacg aacuaacu 



<210> 25 

<211> 11 

<212> RNA 

<213> Severe acute respiratory syndrome virus 
<400> 25
aacuauaaau u 11



<210> 26 

<211> 17 

<212> RNA 

<213> Severe acute respiratory syndrome virus 

<400> 26

uccauaaaac gaacaug 17 

<210> 27

<211> 24

<212> RNA

<213> Severe acute respiratory syndrome virus

<400> 27

ugcucuagua uuuuuaauac uuug 24

<210> 28

<211> 16

<212> RNA

<213> Severe acute respiratory syndrome virus

<400> 28

agucuaaacg aacaug 16

<210> 29

<211> 15

<212> RNA

<213> Severe acute respiratory syndrome virus

<400> 29

cuaauaaacc ucaug 15

<210> 30

<211> 24

<212> RNA

<213> Severe acute respiratory syndrome virus

<400> 30

uaaauaaacg aacaaauuaa aaug 24

<210> 31
<211> 136
<212> DNA

<213> Equine rhinovirus
<400> 31

acccgttacc ctaaaattcc ctcccctttc tcttcactcg ccgaggccac gccgagtagg 60

accgagggta cagcgagtct tttagtttaa ggtgttagat gtaaggtacg tgggctttct 120
tttggtttac ttcttc 136
<210> 32

<211> 178
<212> DNA

<213> Avian infectious bronchitis

<400> 32

tagtttagtt taagttagtt tagagtaggt ataaagatgc cagtgccggg gccacgcgga 60
gtacgatcga gggtacagca ctaggacgcc cattagggga agagctaaat tttagtttaa 120
gttaagttta attggctaag tatagttaaa atttataggc tagtatagag ttagagca 178



<210> 33 

<211> 1255 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 33 

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu
1 5 10 15

Asp Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln
20 25 30

His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg
35 40 45

Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe Tyr Ser
50 55 60

Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe Gly Asn Pro Val
65 70 75 80

Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu Lys Ser Asn
85 90 95

Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln
100 105 110

Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys
115 120 125

Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met
130 135 140

Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr
145 150 155 160
Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser
165 170 175

Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly
180 185 190

Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp
195 200 205

Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro Ile Phe Lys Leu Pro Leu
210 215 220

Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Ser Pro
225 230 235 240

Ala Gln Asp Ile Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr
245 250 255

Leu Lys Pro Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile
260 265 270

Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys
275 280 285

Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn
290 295 300

Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn Ile Thr
305 310 315 320

Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Lys Phe Pro Ser
325 330 335

Val Tyr Ala Trp Glu Arg Lys Lys Ile Ser Asn Cys Val Ala Asp Tyr
340 345 350

Ser Val Leu Tyr Asn Ser Thr Phe Phe Ser Thr Phe Lys Cys Tyr Gly
355 360 365

Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala
370 375 380

Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly
385 390 395 400

Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe
405 410 415



Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Ser
420 425 430

Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr Leu Arg His Gly Lys Leu
435 440 445

Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly
450 455 460

Lys Pro Cys Thr Pro Pro Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp
465 470 475 480

Tyr Gly Phe Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val
485 490 495

Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly
500 505 510

Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn
515 520 525

Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg
530 535 540

Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser Asp Phe Thr Asp
545 550 555 560

Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys
565 570 575

Ala Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Ala Ser Ser
580 585 590

Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr
595 600 605

Ala Ile His Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr
610 615 620

Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu
625 630 635 640

His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile
645 650 655
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Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr Ser Gin Lys 
• 660 665 670 



Ser He Val Ala Tyr Thr' Met Ser Leu Gly Ala Asp Ser' Ser He Ala 
675 680 685 



Tyr Ser Asn Asn Thr He Ala He Pro Thr Asn Phe Ser He Ser He 
690 695 ' 700 



Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr. Sex Val Asp Cys 
705 . 710 715 . 720 



Asn Met Tyr He Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu 

725 730 735 

Gin Tyr Gly Ser Phe Cys Thr Gin Leu Asn Arg Ala Leu Ser Gly He 
740 745- .750 



Ala Ala Glu Gin Asp. Arg Asn Thr Arg Glu Val Phe Ala Gin Val Lys 
755 760 . 765 



Gin Met Tyr- Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn Phe 
770 . 775 780 



Ser Gin He Leu Pro Asp Pro" Leu Lys Pro Thr Lys Arg Ser Phe He 
785 790 795 800 



Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Met
805 810 815



Lys Gin Tyr Gly Glu Cys Leu Gly Asp He Asn Ala Arg Asp Leu He 
820 825 830 



Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr
835 840 845



Asp Asp Met He Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala 
850 855 860 



Thr Ala Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gin He Pro Phe 
865 870 875 880 



Ala Met Gin Met Ala Tyr Arg Phe Asn Gly He Gly Val Thr Gin Asn 
885 890 895 
Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala
900 905 910



Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly
915 920 925



Lys Leu Gin Asp Val Val Asn Gin Asn Ala Gin Ala Leu Asn Thr Leu 
930 935 940 



Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn
945 950 955 960



Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp
965 970 975 

Arg Leu He Thr Gly Arg Leu Gin Ser Leu Gin Thr Tyr Val Thr Gin 
980 985 990 

Gin Leu He Arg Ala Ala Glu He Arg Ala Ser Ala Asn Leu Ala Ala 
995 1000 1005 

Thr Lys Met Ser Glu Cys Val Leu Gly Gin Ser Lys Arg Val Asp 
1010 1015 1020



Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ala Ala
1025 1030 1035



Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ser Gln
1040 1045 1050 



Glu Arg Asn Phe Thr Thr Ala Pro Ala He Cys His Glu Gly Lys 
1055 1060 1065 



Ala Tyr Phe Pro Arg Glu Gly Val Phe Val Phe Asn Gly Thr Ser 
1070 1075 1080 



Trp Phe He Thr Gin Arg Asn Phe Phe Ser Pro Gin He He Thr 
1085 1090 1095



Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val Val He Gly 
1100 1105 1110 



He He Asn Asn Thr Val Tyr Asp Pro Leu Gin Pro Glu Leu Asp 
1115 1120 1125 
Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr Ser
1130 1135 1140

Pro Asp Val Asp Leu Gly Asp Ile Ser Gly Ile Asn Ala Ser Val
1145 1150 1155

Val Asn lie Gin Lys Glu He Asp Arg Leu Asn Glu Val Ala Lys 
1160 1165 1170 

Asn Leu Asn Glu Ser Leu He Asp Leu Gin Glu Leu Gly Lys Tyr 
1175 1180 1185 

Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Val Trp Leu Gly Phe Ile
1190 1195 1200

Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Leu Leu Cys Cys
1205 1210 1215

Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Ala Cys Ser Cys Gly
1220 1225 1230

Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys
1235 1240 1245

Gly Val Lys Leu His Tyr Thr
1250 1255



<210> 34 

<211> 220 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 34 

Met Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu
1 5 10 15

Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met
20 25 30

Leu Leu Gln Phe Ala Tyr Ser Asn Arg Asn Arg Phe Leu Tyr Ile Ile
35 40 45



Lys Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe
50 55 60

Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Val Thr Gly Gly Ile Ala
65 70 75 80

Ile Ala Met Ala Cys Ile Val Gly Leu Met Trp Leu Ser Tyr Phe Val
85 90 95



Ala Ser Phe Arg Leu Phe Ala Arg Thr Arg Ser Met Trp Ser Phe Asn 
100 105 110 



Pro Glu Thr Asn He Leu Leu Asn Val Pro Leu Arg Gly Thr He Val 
115 120 125 



Thr Arg Pro Leu Met Glu Ser Glu Leu Val Ile Gly Ala Val Ile Ile
130 135 140 



Arg Gly His Leu Arg Met Ala Gly His Ser Leu Gly Arg Cys Asp Ile
145 150 155 160 



Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu Ser
165 170 175



Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Gly Thr Asp Ser Gly Phe
180 185 190



Ala Ala Tyr Asn Arg Tyr Arg He Gly Asn Tyr Lys Leu Asn Thr Asp 
195 200 205 



His Ala Gly Ser Asn Asp Asn He Ala Leu Leu Val 
210 215 220 



<210> 35 
<211> 76 
<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 35

Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu Ile Val Asn Ser
1 5 10 15



Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala
20 25 30 



He Leu Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn He Val Asn 
35 40 45 



Val Ser Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn 
50 55 60 
Leu Asn Ser Ser Glu Gly Val Pro Asp Leu Leu Val
65 70 75



<210> 36 

<211> 422

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 36

Met Ser Asp Asn Gly Pro Gln Ser Asn Gln Arg Ser Ala Pro Arg Ile
1 5 10 15



Thr Phe Gly Gly Pro Thr Asp Ser Thr Asp Asn Asn Gln Asn Gly Gly
20 25 30

Arg Asn Gly Ala Arg Pro Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn
35 40 45



Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gin His Gly Lys Glu Glu 
50 55 60 



Leu Arg Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn Ser Gly
65 70 75 80

Pro Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg Arg Val Arg
85 90 95



Gly Gly Asp Gly Lys Met Lys Glu Leu Ser Pro Arg Trp Tyr Phe Tyr 
100 105 110



Tyr Leu Gly Thr Gly Pro Glu Ala Ser Leu Pro Tyr Gly Ala Asn Lys 
115 120 125



Glu Gly Ile Val Trp Val Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys
130 135 140 



Asp His He Gly Thr Arg Asn Pro Asn Asn Asn Ala Ala Thr Val Leu 
145 150 155 160 



Gln Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly
165 170 175 



Ser Arg Gly Gly Ser Gin Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg 
180 185 190 



Gly Asn Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Asn Ser Pro 
195 200 205 
Ala Arg Met Ala Ser Gly Gly Gly Glu Thr Ala Leu Ala Leu Leu Leu
210 215 220 



Leu Asp Arg Leu Asn Gln Leu Glu Ser Lys Val Ser Gly Lys Gly Gln
225 230 235 240



Gln Gln Gln Gly Gln Thr Val Thr Lys Lys Ser Ala Ala Glu Ala Ser
245 250 255



Lys Lys Pro Arg Gln Lys Arg Thr Ala Thr Lys Gln Tyr Asn Val Thr
260 265 270 



Gln Ala Phe Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly
275 280 285 



Asp Gin Asp Leu lie Arg Gin Gly Thr Asp Tyr Lys His Trp Pro Gin 
290 295 300 



lie Ala Gin Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg 
305 310 315 320 



Ile Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr Tyr His Gly
325 330 335 



Ala Ile Lys Leu Asp Asp Lys Asp Pro Gln Phe Lys Asp Asn Val Ile
340 345 350



Leu Leu Asn Lys His Ile Asp Ala Tyr Lys Thr Phe Pro Pro Thr Glu
355 360 365 



Pro Lys Lys Asp Lys Lys Lys Lys Thr Asp Glu Ala Gln Pro Leu Pro
370 375 380 



Gln Arg Gln Lys Lys Gln Pro Thr Val Thr Leu Leu Pro Ala Ala Asp
385 390 395 400



Met Asp Asp Phe Ser Arg Gln Leu Gln Asn Ser Met Ser Gly Ala Ser
405 410 415



Ala Asp Ser Thr Gin Ala 
420 



<210> 37 
<211> 230 
<212> PRT 
<213> Bovine corona virus
<400> 37

Met Ser Ser Val Thr Thr Pro Ala Pro Val Tyr Thr Trp Thr Ala Asp
1 5 10 15



Glu Ala Ile Lys Phe Leu Lys Glu Trp Asn Phe Ser Leu Gly Ile Ile
20 25 30 



Leu Leu Phe lie Thr Val He Leu Gin Phe Gly Tyr Thr Ser Arg Ser 
35 40 45 



Met Phe Val Tyr Val Ile Lys Met Val Ile Leu Trp Leu Met Trp Pro
50 55 60



Leu Thr Ile Ile Leu Thr Ile Phe Asn Cys Val Tyr Ala Leu Asn Asn
65 70 75 80



Val Tyr Leu Gly Phe Ser He Val Phe Thr He Val Ala He He Met 
85 90 95 



Trp Ile Val Tyr Phe Val Asn Ser Ile Arg Leu Phe Ile Arg Thr Gly
100 105 110



Ser Trp Trp Ser Phe Asn Pro Glu Thr Asn Asn Leu Met Cys He Asp 
115 120 125 



Met Lys Gly Arg Met Tyr Val Arg Pro He He Glu Asp Tyr His Thr 
130 135 140



Leu Thr Val Thr Ile Ile Arg Gly His Leu Tyr Met Gln Gly Ile Lys
145 150 155 160



Leu Gly Thr Gly Tyr Ser Leu Ser Asp Leu Pro Ala Tyr Val Thr Val 
165 170 175 



Ala Lys Val Ser His Leu Leu Thr Tyr Lys Arg Gly Phe Leu Asp Lys 
180 185 190 



He Gly Asp Thr Ser Gly Phe Ala Val Tyr Val Lys Ser Lys Val Gly 
195 200 205 



Asn Tyr Arg Leu Pro Ser Thr Gln Lys Gly Ser Gly Leu Asp Thr Ala
210 215 220



Leu Leu Arg Asn Asn Ile
225 230



<210> 38

<211> 226 

<212> PRT 

<213> Avian infectious bronchitis virus

<400> 38 

Met Ser Asn Gly Thr Glu Asn Cys Thr Leu Ser Thr Gln Gln Ala Ala
1 5 10 15



Glu Leu Phe Lys Glu Tyr Asn Leu Phe Ile Thr Ala Phe Leu Leu Phe
20 25 30 



Leu Thr Ile Leu Leu Gln Tyr Gly Tyr Ala Thr Arg Ser Arg Phe Ile
35 40 45



Tyr Ile Leu Lys Met Ile Val Leu Trp Cys Phe Trp Pro Leu Asn Ile
50 55 60



Ala Val Gly Ile Ile Ser Cys Ile Tyr Pro Pro Asn Thr Gly Gly Leu
65 70 75 80

Val Ala Ala Ile Ile Leu Thr Val Phe Ala Cys Leu Ser Phe Val Gly
85 90 95



Tyr Trp lie Gin Ser Phe Arg Leu Phe Lys Arg Cys Arg Ser Trp Trp 
100 105 110 



Ser Phe Asn Pro Glu Ser Asn Ala Val Gly Ser Ile Leu Leu Thr Asn
115 120 125



Gly Gln Gln Cys Asn Phe Ala Ile Glu Ser Val Pro Met Val Leu Ser
130 135 140 



Pro Ile Ile Lys Asn Gly Ala Leu Tyr Cys Glu Gly Gln Trp Leu Ala
145 150 155 160 



Lys Cys Glu Pro Asp His Leu Pro Lys Asp He Phe Val Cys Thr Pro 
165 170 175 



Asp Arg Arg Asn He Tyr Arg Met Val Gin Lys Tyr Thr Gly Asp Gin 
180 185 190 



Ser Gly Asn Lys Lys Arg Phe Ala Thr Phe Val Tyr Ala Lys Gin Ser 
195 200 205 
Val Asp Thr Gly Glu Leu Gly Ser Val Ala Thr Gly Gly Ser Ser Leu
210 215 220 



Tyr Thr 
225 



<210> 39

<211> 262
<212> PRT

<213> Transmissible gastroenteritis virus
<400> 39

Met Lys Ile Leu Leu Ile Leu Ala Cys Val Ile Ala Cys Ala Cys Gly
1 5 10 15



Glu Arg Tyr Cys Ala Met Lys Ser Asp Thr Asp Leu Ser Cys Arg Asn
20 25 30



Ser Thr Ala Ser Asp Cys Glu Ser Cys Phe Asn Gly Gly Asp Leu Ile
35 40 45



Trp His Leu Ala Asn Trp Asn Phe Ser Trp Ser Ile Ile Leu Ile Val
50 55 60

Phe Ile Thr Val Leu Gln Tyr Gly Arg Pro Gln Phe Ser Trp Phe Val
65 70 75 80

Tyr Gly Ile Lys Met Leu Ile Met Trp Leu Leu Trp Pro Val Val Leu
85 90 95

Ala Leu Thr Ile Phe Asn Ala Tyr Ser Glu Tyr Gln Val Ser Arg Tyr
100 105 110 



Val Met Phe Gly Phe Ser He Ala Gly Ala He Val Thr Phe Val Leu 
115 120 125 



Trp He Met Tyr Phe Val Arg Ser He Gin Leu Tyr Arg Arg Thr Lys 
130 135 140 



Ser Trp Trp Ser Phe Asn Pro Glu Thr Lys Ala He Leu Cys Val Ser 
145 150 155 160 



Ala Leu Gly Arg Ser Tyr Val Leu Pro Leu Glu Gly Val Pro Thr Gly 
165 170 175 

Val Thr Leu Thr Leu Leu Ser Gly Asn Leu Tyr Ala Glu Gly Phe Lys
180 185 190

Ile Ala Gly Gly Met Asn Ile Asp Asn Leu Pro Lys Tyr Val Met Val
195 200 205

Ala Leu Pro Ser Arg Thr Ile Val Tyr Thr Leu Val Gly Lys Lys Leu
210 215 220



Lys Ala Ser Ser Ala Thr Gly Trp Ala Tyr Tyr Val Lys Ser Lys Ala 
225 230 235 240

Gly Asp Tyr Ser Thr Glu Ala Arg Thr Asp Asn Leu Ser Glu Gln Glu
245 250 255

Lys Leu Leu His Met Val
260 



<210> 40

<211> 263

<212> PRT

<213> feline coronavirus

<400> 40

Met Lys Ile Leu Leu Ile Leu Ala Cys Ala Val Ala Cys Val Tyr Gly
1 5 10 15



Glu Gin lie Arg Tyr Cys Ala Met Gin Glu Thr Gly Leu Ser Cys Arg 
20 25 30 



Asn Gly Thr Ala Ser Asp Cys Glu Ser Cys Phe Asn Gly Gly Asp Leu
35 40 45

Ile Trp His Leu Ala Asn Trp Asn Phe Ser Trp Ser Ile Ile Leu Ile
50 55 60



Val Phe Ile Thr Val Leu Gln Tyr Gly Arg Pro Gln Phe Ser Trp Phe
65 70 75 80

Val Tyr Gly Ile Lys Met Leu Ile Met Trp Leu Leu Trp Pro Ile Val
85 90 95



Leu Ala Leu Thr He Phe Asn Ala Tyr Ser Glu Tyr Glu Val Ser Arg 
100 105 110 



Tyr Val Met Phe Gly Phe Ser Val Ala Gly Ala Val Val Thr Phe Ala 
115 120 125 
Leu Trp Met Met Tyr Phe Val Arg Ser Ile Gln Leu Tyr Arg Arg Thr
130 135 140

Lys Ser Trp Trp Ser Phe Asn Pro Glu Thr Asn Ala Ile Leu Cys Val
145 150 155 160

Asn Ala Leu Gly Arg Ser Tyr Val Leu Pro Leu Asp Gly Thr Pro Thr
165 170 175

Gly Val Thr Leu Thr Leu Leu Ser Gly Asn Leu Tyr Ala Glu Gly Phe
180 185 190

Lys Met Ala Gly Gly Leu Thr Ile Glu His Leu Pro Lys Tyr Val Met
195 200 205

Ile Arg Thr Pro Asn Arg Thr Ile Val Tyr Thr Leu Val Gly Lys Gln
210 215 220

Leu Lys Ala Thr Thr Ala Thr Gly Trp Ala Tyr Tyr Val Lys Ser Lys
225 230 235 240

Ala Gly Asp Tyr Ser Thr Glu Ala Arg Thr Asp Asn Leu Ser Glu His
245 250 255

Glu Lys Leu Leu His Met Val
260

<210> 41 
<211> 231

<212> PRT

<213> Human coronavirus OC43 

MSSKTTPAPVYIWTADEAIKFLKEWNFSLGIILLFITIILQFGYTSRSMFVYVIKMIILWLMWPLTIILTIFNCVY 

ALWNVYLGLSIVFTIVAIIMWIVYFVNSIRLFIRTGSFWSFNPETNNLMCIDMKGTMYVRPIIEDYHTLTVTIIRG 

HLYIQGIKLGTGYSWADLPAYMTVAKVTHLCTYKRGFLDRISDTSGFAVYVKSKVGNYRLPSTQKGSGMDTALLRN 
NX 

<SEQ ID NO: 37; prt; Porcine hemagglutinating encephalomyelitis virus 
<400> 41

Met Ser Ser Pro Thr Thr Pro Val Pro Val Ile Ser Trp Thr Ala Asp
1 5 10 15

Glu Ala Ile Lys Phe Leu Lys Glu Trp Asn Phe Ser Leu Gly Ile Ile
20 25 30



Val Leu Phe Ile Thr Ile Ile Leu Gln Phe Gly Tyr Thr Ser Arg Ser
35 40 45

Met Phe Val Tyr Val Ile Lys Met Val Ile Leu Trp Leu Met Trp Pro
50 55 60



Leu Thr Ile Ile Leu Thr Ile Phe Asn Cys Val Tyr Ala Leu Asn Asn
65 70 75 80



Val Tyr Leu Gly Phe Ser Ile Val Phe Thr Ile Val Ala Ile Ile Met
85 90 95



Trp Val Val Tyr Phe Val Asn Ser Ile Arg Leu Phe Ile Arg Thr Gly
100 105 110



Ser Trp Trp Ser Phe Asn Pro Glu Thr Asn Asn Leu Met Cys Ile Asp
115 120 125



Met Lys Gly Arg Met Tyr Val Arg Pro Ile Ile Glu Asp Tyr His Thr
130 135 140



Leu Thr Ala Thr Ile Ile Arg Gly His Leu Tyr Ile Gln Gly Ile Lys
145 150 155 160



Leu Gly Thr Gly Tyr Ser Leu Ser Asp Leu Pro Ala Tyr Val Thr Val
165 170 175



Ala Lys Val Thr His Leu Cys Thr Tyr Lys Arg Gly Phe Leu Asp Arg 
180 185 190 



Ile Gly Asp Thr Ser Gly Phe Ala Val Tyr Val Lys Ser Lys Val Gly
195 200 205 



Asn Tyr Arg Leu Pro Ser Thr His Lys Gly Ser Gly Met Asp Thr Ala 
210 215 220 



Leu Leu Arg Asn Asn Ile Met
225 230



<210> 42

<211> 223

<212> PRT

<213> Avian infectious

<400> 42

Met Met Glu Asn Cys Thr Leu Asn Leu Glu Gln Ala Thr Leu Leu Phe
1 5 10 15



Lys Glu Tyr Asn Leu Phe Ile Thr Ala Phe Leu Leu Phe Leu Thr Ile
20 25 30



Leu Leu Gln Tyr Gly Tyr Ala Thr Arg Ser Arg Phe Ile Tyr Ile Leu
35 40 45



Lys Met Ile Val Leu Trp Cys Phe Trp Pro Leu Asn Ile Ala Val Gly
50 55 60



Val Ile Ser Cys Ile Tyr Pro Pro Asn Thr Gly Gly Leu Val Ala Ala
65 70 75 80



Ile Ile Leu Thr Val Phe Ala Cys Leu Ser Phe Val Gly Tyr Trp Ile
85 90 95



Gin Ser Cys Arg Leu Phe Lys Arg Cys Arg Ser Trp Trp Ser Phe Asn 
100 105 110 



Pro Glu Ser Asn Ala Val Gly Ser Ile Leu Leu Thr Asn Gly Gln Gln
115 120 125



Cys Asn Phe Ala Ile Glu Ser Val Pro Met Val Leu Ala Pro Ile Ile
130 135 140



Lys Asn Gly Val Leu Tyr Cys Glu Gly Gln Trp Leu Ala Lys Cys Glu
145 150 155 160

Pro Asp His Leu Pro Lys Asp He Phe Val Cys Thr Pro Asp Arg Arg 
165 170 175 



Asn He Tyr Arg Met Val Gin Lys Tyr Thr Gly Asp Gin Ser Gly Asn 
180 185 190 



Lys Lys Arg Val Ala Thr Phe Val Tyr Ala Lys Gin Ser Val Asp Thr 
195 200 205 



Gly Glu Leu Glu Ser Val Pro Thr Gly Gly Ser Ser Leu Tyr Thr 
210 215 220 



<210> 43

<211> 455 

<212> PRT 

<213> Mouse Hepatitis Virus 

<400> 43 

Met Ser Phe Val Pro Gly Gln Glu Asn Ala Gly Ser Arg Ser Ser Ser
1 5 10 15

Val Asn Arg Ala Gly Asn Gly Ile Leu Lys Lys Thr Thr Trp Ala Asp
20 25 30



Gln Thr Glu Arg Gly Pro Asn Asn Gln Asn Arg Gly Arg Arg Asn Gln
35 40 45



Pro Lys Gin Thr Ala Thr Thr Gin Pro Asn Ser Gly Ser Val Val Pro 
50 55 60 



His Tyr Ser Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu
65 70 75 80



Phe Gln Phe Ala Gln Gly Gln Gly Val Pro Ile Ala Asn Gly Ile Pro
85 90 95



Ala Ser Glu Gln Lys Gly Tyr Trp Tyr Arg His Asn Arg Arg Ser Phe
100 105 110



Lys Thr Pro Asp Gly Gln Gln Lys Gln Leu Leu Pro Arg Trp Tyr Phe
115 120 125



Tyr Tyr Leu Gly Thr Gly Pro His Ala Gly Ala Glu Tyr Gly Asp Asp
130 135 140



Ile Asp Gly Val Val Trp Val Ala Ser Gln Gln Ala Asp Thr Lys Thr
145 150 155 160



Thr Ala Asp Ile Val Glu Arg Asp Pro Ser Ser His Glu Ala Ile Pro
165 170 175



Thr Arg Phe Ala Pro Gly Thr Val Leu Pro Gin Gly Phe Tyr Val Glu 
180 185 190 



Gly Ser Gly Arg Ser Ala Pro Ala Ser Arg Ser Gly Ser Arg Ser Gln
195 200 205 



Ser Arg Gly Pro Asn Asn Arg Ala Arg Ser Ser Ser Asn Gin Arg Gin 
210 215 220 



Pro Ala Ser Thr Val Lys Pro Asp Met Ala Glu Glu He Ala Ala Leu 
225 230 235 240 



Val Leu Ala Lys Leu Gly Lys Asp Ala Gly Gln Pro Lys Gln Val Thr
245 250 255

Lys Gln Ser Ala Lys Glu Val Arg Gln Lys Ile Leu Asn Lys Pro Arg
260 265 270



Gln Lys Arg Thr Pro Asn Lys Gln Cys Pro Val Gln Gln Cys Phe Gly
275 280 285



Lys Arg Gly Pro Asn Gin Asn Phe Gly Gly Ser Glu Met Leu Lys Leu 
290 295 300 



Gly Thr Ser Asp Pro Gln Phe Pro Ile Leu Ala Glu Leu Ala Pro Thr
305 310 315 320

Pro Ser Ala Phe Phe Phe Gly Ser Lys Leu Glu Leu Val Lys Lys Asn
325 330 335



Ser Gly Gly Ala Asp Asp Pro Thr Lys Asp Val Tyr Glu Leu Gln Tyr
340 345 350

Ser Gly Ala Ile Arg Phe Asp Ser Thr Leu Pro Gly Phe Glu Thr Ile
355 360 365

Met Lys Val Leu Asn Glu Asn Leu Asp Ala Tyr Gln Asp Gln Ala Gly
370 375 380



Gly Ala Asp Val Val Ser Pro Lys Pro Gln Arg Lys Arg Gly Thr Lys
385 390 395 400

Gln Lys Ala Leu Lys Gly Glu Val Asp Asn Val Ser Val Ala Lys Pro
405 410 415 



Lys Ser Ser Val Gin Arg Asn Val Ser Arg Glu Leu Thr Pro Glu Asp 
420 425 430 



Arg Ser Leu Leu Ala Gln Ile Leu Asp Asp Gly Val Val Pro Asp Gly
435 440 445



Leu Glu Asp Asp Ser Asn Val
450 455



<210> 44 

<211> 448

<212> PRT 

<213> Bovine corona virus 

<400> 44 

Met Ser Phe Thr Pro Gly Lys Gln Ser Ser Ser Arg Ala Ser Ser Gly
1 5 10 15

Asn Arg Ser Gly Asn Gly Ile Leu Lys Trp Ala Asp Gln Ser Asp Gln
20 25 30



Ser Arg Asn Val Gln Thr Arg Gly Arg Arg Ala Gln Pro Lys Gln Thr
35 40 45



Ala Thr Ser Gln Gln Pro Ser Gly Gly Asn Val Val Pro Tyr Tyr Ser
50 55 60



Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu Phe Gln Phe
65 70 75 80

Ala Glu Gly Gin Gly Val Pro He Ala Pro Gly Val Pro Ala Thr Glu 
85 90 95 



Ala Lys Gly Tyr Trp Tyr Arg His Asn Arg Arg Ser Phe Lys Thr Ala 
100 105 110



Asp Gly Asn Gin Arg Gin Leu Leu Pro Arg Trp Tyr Phe Tyr Tyr Leu 
115 120 125 



Gly Thr Gly Pro His Ala Lys Asp Gin Tyr Gly Thr Asp He Asp Gly 
130 135 140 



Val Tyr Trp Val Ala Ser Asn Gin Ala Asp Val Asn Thr Pro Ala Asp 
145 150 155 160 



Ile Leu Asp Arg Asp Pro Ser Ser Asp Glu Ala Ile Pro Thr Arg Phe
165 170 175



Pro Pro Gly Thr Val Leu Pro Gin Gly Tyr Tyr He Glu Gly Ser Gly 
180 185 190 



Arg Ser Ala Pro Asn Ser Arg Ser Thr Ser Arg Ala Ser Ser Arg Ala 
195 200 205 



Ser Ser Ala Gly Ser Arg Ser Arg Ala Asn Ser Gly Asn Arg Thr Pro 
210 215 220 



Thr Ser Gly Val Thr Pro Asp Met Ala Asp Gin He Ala Ser Leu Val 
225 230 235 240 



Leu Ala Lys Leu Gly Lys Asp Ala Ala Lys Pro Gin Gin Val Thr Lys 
245 250 255 
Gln Thr Ala Lys Glu Ile Arg Gln Lys Ile Leu Asn Lys Pro Arg Gln
260 265 270



Lys Arg Ser Pro Asn Lys Gln Cys Thr Val Gln Gln Cys Phe Gly Lys
275 280 285



Arg Gly Pro Asn Gln Asn Phe Gly Gly Gly Glu Met Leu Lys Leu Gly
290 295 300



Thr Ser Asp Pro Gin Phe Pro lie Leu Ala Glu Leu Ala Pro Thr Ala 
305 310 315 320 



Gly Ala Phe Phe Phe Gly Ser Arg Leu Glu Leu Ala Lys Val Gln Asn
325 330 335 



Leu Ser Gly Asn Leu Asp Glu Pro Gin Lys Asp Val Tyr Glu Leu Arg 
340 345 350 



Tyr Asn Gly Ala lie Arg Phe Asp Ser Thr Leu Ser Gly Phe Glu Thr 
355 360 365 



Ile Met Lys Val Leu Asn Glu Asn Leu Asn Ala Tyr Gln Gln Gln Asp
370 375 380



Gly Thr Met Asn Met Ser Pro Lys Pro Gin Arg Gin Arg Gly Gin Lys 
385 390 395 400 



Asn Gly Gln Gly Glu Asn Asp Asn Ile Ser Val Ala Ala Pro Lys Ser
405 410 415 



Arg Val Gln Gln Asn Lys Ile Arg Glu Leu Thr Ala Glu Asp Ile Ser
420 425 430 



Leu Leu Lys Lys Met Asp Glu Pro Phe Thr Glu Asp Thr Ser Glu Ile
435 440 445



<210> 45 
<211> 409 
<212> PRT 

<213> Avian infectious bronchitis virus 
<400> 45 

Met Ala Ser Gly Lys Ala Ala Gly Lys Thr Asp Ala Pro Ala Pro Val
1 5 10 15



Ile Lys Leu Gly Gly Pro Lys Pro Pro Lys Val Gly Ser Ser Gly Asn
20 25 30



Ala Ser Trp Phe Gln Ala Leu Lys Ala Lys Lys Leu Asn Ala Pro Ala
35 40 45



Pro Lys Phe Glu Gly Ser Gly Val Pro Asp Asn Glu Asn Leu Lys Ile
50 55 60



Ser Gln Gln His Gly Tyr Trp Arg Arg Gln Ala Arg Tyr Lys Pro Gly
65 70 75 80



Lys Gly Gly Arg Lys Pro Val Pro Asp Ala Trp Tyr Phe Tyr Tyr Thr
85 90 95



Gly Thr Gly Pro Ala Ala Asp Leu Asn Trp Gly Asp Ser Gin Asp Gly 
100 105 110 



Ile Val Trp Val Ala Ala Lys Gly Ala Asp Val Lys Ser Arg Ser Asn
115 120 125



Gln Gly Thr Arg Asp Pro Asp Lys Phe Asp Gln Tyr Pro Leu Arg Phe
130 135 140



Ser Asp Gly Gly Pro Asp Gly Asn Phe Arg Trp Asp Phe Ile Pro Leu
145 150 155 160 



Asn Arg Gly Arg Ser Gly Arg Ser Thr Ala Ala Ser Ser Ala Ala Ser 
165 170 175 



Ser Arg Ala Pro Ser Arg Glu Gly Ser Arg Gly Arg Leu Asn Gly Ala 
180 185 190



Glu Asp Asp Leu lie Ala Arg Ala Ala Lys lie He Gin Asp Gin Gin 
195 200 205 



Lys Lys Gly Ser Arg He Thr Lys Ala Lys Ala Glu Glu Met He His 
210 215 220 



Arg Arg Tyr Cys Lys Arg Thr Val Pro Pro Gly Val Ser Ile Asp Lys
225 230 235 240



Val Phe Gly Pro Arg Thr Lys Gly Lys Glu Gly Asn Phe Gly Asp Asp 
245 250 255 



Lys Met Asn Glu Glu Gly He Lys Asp Gly Arg Val Thr Ala Met Leu 
260 265 270 
Asn Leu Val Pro Ser Ser His Ala Cys Leu Phe Gly Ser Gln Val Thr
275 280 285



Pro Lys Leu Gln Pro Asp Gly Leu His Leu Thr Phe Arg Phe Thr Thr
290 295 300



Val Val Ser Arg Asp Asp Pro Gln Phe Asp Asn Tyr Val Lys Ile Cys
305 310 315 320



Asp Glu Cys Val Asp Gly Val Gly Thr Arg Pro Lys Asp Glu Val Val
325 330 335



Arg Pro Lys Ser Arg Ser Ser Ser Arg Pro Ala Thr Arg Gly Thr Ser 
340 345 350



Pro Ala Pro Lys Gin Gin Arg Pro Lys Lys Glu Lys Lys Pro Lys Lys 
355 360 365 



Gin Asp Asp Glu Val Asp Lys Ala Leu Thr Ser Asp Glu Glu Arg Asn 
370 375 380



Asn Ala Gln Leu Glu Phe Asp Asp Glu Pro Lys Val Ile Asn Trp Gly
385 390 395 400



Asp Ser Ala Leu Gly Glu Asn Glu Leu 
405 



<210> 46
<211> 376
<212> PRT

<213> Feline coronavirus
<400> 46

Met Ala Thr Gln Gly Gln Arg Val Asn Trp Gly Asp Glu Pro Ser Lys
1 5 10 15



Arg Arg Gly Arg Ser Asn Ser Arg Gly Arg Lys Asn Asn Asp Ile Pro
20 25 30



Leu Ser Tyr Phe Asn Pro Ile Thr Leu Asp Gln Gly Ser Lys Phe Trp
35 40 45



Asn Leu Cys Pro Arg Asp Phe Val Pro Lys Gly Ile Gly Asn Lys Asp
50 55 60

Gln Gln Ile Gly Tyr Trp Asn Arg Gln Ala Arg Tyr Arg Ile Val Lys
65 70 75 80

Gly Gln Arg Val Glu Leu Pro Glu Arg Trp Phe Phe Tyr Phe Leu Gly
85 90 95



Thr Gly Pro His Ala Asp Ala Lys Phe Lys Ala Lys Ile Asp Gly Val
100 105 110

Phe Trp Val Ala Arg Asp Gly Ala Met Asn Lys Pro Thr Ser Leu Gly
115 120 125

Thr Arg Gly Thr Asn Asn Glu Ser Lys Pro Leu Lys Phe Asp Gly Lys
130 135 140



Ile Pro Pro Gln Phe Gln Leu Glu Val Asn Arg Ser Arg Asn Asn Ser
145 150 155 160

Arg Ser Gly Ser Gln Ser Arg Ser Val Ser Arg Asn Arg Ser Gln Ser
165 170 175



Arg Gly Arg Gln Gln Ser Asn Asn Gln Asn Thr Asn Val Glu Asp Thr
180 185 190



Ile Val Ala Val Leu Gln Lys Leu Gly Val Thr Asp Lys Gln Arg Ser
195 200 205



Arg Ser Lys Ser Gly Glu Arg Ser Gln Ser Lys Ser Arg Asp Thr Thr
210 215 220



Pro Lys Asn Ala Asn Lys His Thr Trp Lys Lys Thr Ala Gly Lys Gly 
225 230 235 240



Asp Val Thr Asn Phe Tyr Gly Ala Arg Ser Ser Ser Ala Asn Phe Gly 
245 250 255 

Asp Ser Asp Leu Val Ala Asn Gly Asn Ala Ala Lys Cys Tyr Pro Gln
260 265 270

Ile Ala Glu Cys Val Pro Ser Val Ser Ser Ile Leu Phe Gly Ser Gln
275 280 285

Trp Ser Ala Glu Glu Ala Gly Asp Gln Val Lys Val Thr Leu Thr His
290 295 300



Asn Tyr Tyr Leu Pro Lys Asp Asp Ala Lys Thr Ser Gln Phe Leu Glu
305 310 315 320



Gln Ile Asp Ala Tyr Lys Arg Pro Ser Glu Val Ala Lys Asp Gln Arg
325 330 335



Gln Arg Lys Ser Arg Ser Lys Ser Ala Asp Lys Lys Pro Glu Glu Leu
340 345 350



Ser Val Thr Leu Glu Ala Tyr Thr Asp Val Phe Asp Asp Thr Gln Val
355 360 365



Glu Met Ile Asp Glu Val Thr Asn
370 375



<210> 47 

<211> 382 

<212> PRT 

<213> porcine transmissible gastroenteritis virus 

<400> 47

Met Ala Asn Gln Gly Gln Arg Val Ser Trp Gly Asp Glu Ser Thr Lys
1 5 10 15



Thr Arg Gly Arg Ser Asn Ser Arg Gly Arg Lys Asn Asn Asn Ile Pro
20 25 30



Leu Ser Phe Phe Asn Pro Ile Thr Leu Gln Gln Gly Ser Lys Phe Trp
35 40 45



Asn Leu Cys Pro Arg Asp Phe Val Pro Lys Gly Ile Gly Asn Arg Asp
50 55 60



Gln Gln Ile Gly Tyr Trp Asn Arg Gln Thr Arg Tyr Arg Met Val Lys
65 70 75 80



Gly Gln Arg Lys Glu Leu Pro Glu Arg Trp Phe Phe Tyr Tyr Leu Gly
85 90 95



Thr Gly Pro His Ala Asp Ala Lys Phe Lys Asp Lys Leu Asp Gly Val 
100 105 110 



Val Trp Val Ala Lys Asp Gly Ala Met Asn Lys Pro Thr Thr Leu Gly 
115 120 125 



Ser Arg Gly Ala Asn Asn Glu Ser Lys Ala Leu Lys Phe Asp Gly Lys 
130 135 140 



Val Pro Gly Glu Phe Gln Leu Glu Val Asn Gln Ser Arg Asp Asn Ser
145 150 155 160



Arg Leu Arg Ser Gln Ser Arg Ser Arg Ser Arg Asn Arg Ser Gln Ser
165 170 175



Arg Gly Arg Gin Gin Ser Asn Asn Lys Lys Asp Asp Ser Val Glu Gin 
180 185 190 



Ala Val Leu Ala Ala Leu Lys Lys Leu Gly Val Tyr Thr Glu Lys Gin 
195 200 205 



Gln Gln Arg Ser Arg Ser Lys Ser Lys Glu Arg Ser Asn Ser Lys Ile
210 215 220



Arg Asp Thr Thr Pro Lys Asn Glu Asn Lys His Thr Trp Lys Arg Thr
225 230 235 240



Ala Gly Lys Gly Asp Val Thr Arg Phe Tyr Gly Thr Arg Ser Asn Ser
245 250 255



Ala Asn Phe Gly Asp Ser Asp Leu Val Ala Asn Gly Ser Ser Ala Lys 
260 265 270



His Tyr Pro Gln Leu Ala Glu Cys Val Pro Ser Val Ser Ser Ile Leu
275 280 285



Phe Gly Ser Tyr Trp Thr Ser Lys Glu Asp Gly Asp Gln Ile Glu Val
290 295 300



Thr Phe Thr His Lys Tyr His Leu Pro Lys Asp Asp Pro Lys Thr Gly 
305 310 315 320



Gln Phe Leu Gln Gln Ile Asn Ala Tyr Ala Arg Pro Ser Glu Val Ala
325 330 335 



Lys Glu Gin Arg Lys Arg Lys Ser Arg Ser Lys Ser Ala Glu Arg Ser 
340 345 350 



Glu Gin Glu Val Val Pro Asp Ala Leu He Glu Asn Tyr Thr Asp Val 
355 360 365 



Phe Asp Asp Thr Gin Val Glu Met He Asp Glu Val Thr Asn 
370 375 380 
<210> 48

<211> 389
<212> PRT

<213> Human coronavirus 229E
<400> 48

Met Ala Thr Val Lys Trp Ala Asp Ala Ser Glu Pro Gln Arg Gly Arg
1 5 10 15

Gln Gly Arg Ile Pro Tyr Ser Leu Tyr Ser Pro Leu Leu Val Asp Ser
20 25 30

Glu Gln Pro Trp Lys Val Ile Pro Arg Asn Leu Val Pro Ile Asn Lys
35 40 45

Lys Asp Lys Asn Lys Leu Ile Gly Tyr Trp Asn Val Gln Lys Arg Phe
50 55 60

Arg Thr Arg Lys Gly Lys Arg Val Asp Leu Ser Pro Lys Leu His Phe
65 70 75 80

Tyr Tyr Leu Gly Thr Gly Pro His Lys Asp Ala Lys Phe Arg Glu Arg
85 90 95

Val Gln Gly Val Val Trp Val Ala Val Asp Gly Ala Lys Thr Glu Pro
100 105 110

Thr Gly Tyr Gly Val Arg Arg Lys Asn Ser Glu Pro Glu Ile Pro His
115 120 125

Phe Asn Gln Lys Leu Pro Asn Gly Val Thr Val Val Glu Glu Pro Asp
130 135 140

Ser Arg Ala Pro Ser Arg Ser Gln Ser Arg Ser Gln Ser Arg Gly Arg
145 150 155 160

Gly Glu Ser Lys Pro Gln Ser Arg Asn Pro Ser Ser Asp Arg Asn His
165 170 175

Asn Ser Gln Asp Asp Ile Met Lys Ala Val Ala Ala Ala Leu Lys Ser
180 185 190

Leu Gly Phe Asp Lys Pro Gln Glu Lys Asp Lys Lys Ser Ala Lys Thr
195 200 205

Gly Thr Pro Lys Pro Ser Arg Asn Gln Ser Pro Ala Ser Ser Gln Thr
210 215 220

Ser Ala Lys Ser Leu Ala Arg Ser Gln Ser Ser Glu Thr Lys Glu Gln
225 230 235 240



Lys His Glu Met Gln Lys Pro Arg Trp Lys Arg Gln Pro Asn Asp Asp
245 250 255



Val Thr Ser Asn Val Thr Gin Cys Phe Gly Pro Arg Asp Leu Asp His 
260 265 270 



Asn Phe Gly Ser Ala Gly Val Val Ala Asn Gly Val Lys Ala Lys Gly 
275 280 285



Tyr Pro Gln Phe Ala Glu Leu Val Pro Ser Thr Ala Ala Met Leu Phe
290 295 300



Asp Ser His Ile Val Ser Lys Glu Ser Gly Asn Thr Val Val Leu Thr
305 310 315 320



Phe Thr Thr Arg Val Thr Val Pro Lys Asp His Pro His Leu Gly Lys
325 330 335



Phe Leu Glu Glu Leu Asn Ala Phe Thr Arg Glu Met Gln Gln His Pro
340 345 350



Leu Leu Asn Pro Ser Ala Leu Glu Phe Asn Pro Ser Gln Thr Ser Pro
355 360 365



Ala Thr Ala Glu Pro Val Arg Asp Glu Val Ser Ile Glu Thr Asp Ile
370 375 380



Ile Asp Glu Val Asn
385



<210> 49

<211> 448 

<212> PRT 

<213> Human coronavirus 

<400> 49 

Met Ser Phe Thr Pro Gly Lys Gln Ser Ser Ser Arg Ala Ser Ser Gly
1 5 10 15



Asn Arg Ser Gly Asn Gly Ile Leu Lys Trp Ala Asp Gln Ser Asp Gln
20 25 30



Val Arg Asn Val Gln Thr Arg Gly Arg Arg Ala Gln Pro Lys Gln Thr
35 40 45



Ala Thr Ser Gln Gln Pro Ser Gly Gly Asn Val Val Pro Tyr Tyr Ser
50 55 60



Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu Phe Glu Phe
65 70 75 80

Val Glu Gly Gln Gly Pro Pro Ile Ala Pro Gly Val Pro Ala Thr Glu
85 90 95

Ala Lys Gly Tyr Trp Tyr Arg His Asn Arg Gly Ser Phe Lys Thr Ala
100 105 110

Asp Gly Asn Gln Arg Gln Leu Leu Pro Arg Trp Tyr Phe Tyr Tyr Leu
115 120 125

Gly Thr Gly Pro His Ala Lys Asp Gln Tyr Gly Thr Asp Ile Asp Gly
130 135 140

Val Tyr Trp Val Ala Ser Asn Gln Ala Asp Val Asn Thr Pro Ala Asp
145 150 155 160

Ile Val Asp Arg Asp Pro Ser Ser Asp Glu Ala Ile Pro Thr Arg Phe
165 170 175

Pro Pro Gly Thr Val Leu Pro Gln Gly Tyr Tyr Ile Glu Gly Ser Gly
180 185 190



Arg Ser Ala Pro Asn Ser Arg Ser Thr Ser Arg Thr Ser Ser Arg Ala
195 200 205



Ser Ser Ala Gly Ser Arg Ser Arg Ala Asn Ser Gly Asn Arg Thr Pro
210 215 220



Thr Ser Gly Val Thr Pro Asp Met Ala Asp Gln Ile Ala Ser Leu Val
225 230 235 240

Leu Ala Lys Leu Gly Lys Asp Ala Thr Lys Pro Gln Gln Val Thr Lys
245 250 255



His Thr Ala Lys Glu Val Arg Gln Lys Ile Leu Asn Lys Pro Arg Gln
260 265 270



Lys Arg Ser Pro Asn Lys Gln Cys Thr Val Gln Gln Cys Phe Gly Lys
275 280 285



Arg Gly Pro Asn Gln Asn Phe Gly Gly Gly Glu Met Leu Lys Leu Gly
290 295 300 



Thr Ser Asp Pro Gln Phe Pro Ile Leu Ala Glu Leu Ala Pro Thr Ala
305 310 315 320



Gly Ala Phe Phe Phe Gly Ser Arg Leu Glu Leu Ala Lys Val Gln Asn
325 330 335



Leu Ser Gly Asn Pro Asp Glu Pro Gin Lys Asp Val Tyr Glu Leu Arg 
340 345 350 



Tyr Asn Gly Ala lie Arg Phe Asp Ser Thr Leu Ser Gly Phe Glu Thr 
355 360 365 



lie Met Lys Val Leu Asn Glu Asn Leu Asn Ala Tyr Gin Gin Gin Asp 
370 375 380 



Gly Met Met Asn Met Ser Pro Lys Pro Gln Arg Gln Arg Gly His Lys
385 390 395 400



Asn Gly Gin Gly Glu Asn Asp Asn lie Ser Val Ala Val Pro Lys Ser 
405 410 415



Arg Val Gin Gin Asn Lys Ser Arg Glu Leu Thr Ala Glu Asp lie Ser 
420 425 430



Leu Leu Lys Lys Met Asp Glu Pro Tyr Thr Glu Asp Thr Ser Glu lie 
435 440 445 



<210> 50
<211> 449
<212> PRT

<213> porcine hemagglutinating encephalomyelitis
<400> 50

Met Ser Phe Thr Pro Gly Lys Gln Ser Ser Ser Arg Ala Ser Ser Gly
1 5 10 15



Asn Arg Ser Gly Asn Gly lie Leu Lys Trp Ala Asp Gin Ser Asp Gin 
20 25 30 



Ser Arg Asn Val Gln Thr Arg Gly Arg Arg Val Gln Ser Lys Gln Thr
35 40 45

Ala Thr Ser Gln Gln Pro Ser Gly Gly Thr Val Val Pro Tyr Tyr Ser
50 55 60



Trp Phe Ser Gly Ile Thr Gln Phe Gln Lys Gly Lys Glu Phe Glu Phe
65 70 75 80



Ala Glu Gly Gln Gly Val Pro Ile Ala Pro Gly Val Pro Ser Thr Glu
85 90 95



Ala Lys Gly Tyr Trp Tyr Arg His Asn Arg Arg Ser Phe Lys Thr Ala 
100 105 110



Asp Gly Asn Gln Arg Gln Leu Leu Pro Arg Trp Tyr Phe Tyr Tyr Leu
115 120 125 

Gly Thr Gly Pro His Ala Lys Asp Gin Tyr Gly Thr Asp He Asp Gly 
130 135 140 



Val Phe Trp Val Ala Ser Asn Gln Ala Asp Ile Asn Thr Pro Ala Asp
145 150 155 160



Ile Val Asp Arg Asp Pro Ser Ser Asp Glu Ala Ile Pro Thr Arg Phe
165 170 175



Pro Pro Gly Thr Val Leu Pro Gln Gly Tyr Tyr Ile Glu Gly Ser Gly
180 185 190



Arg Ser Ala Pro Asn Ser Arg Ser Thr Ser Arg Ala Pro Asn Arg Ala
195 200 205



Pro Ser Ala Gly Ser Arg Ser Arg Ala Asn Ser Gly Asn Arg Thr Ser 
210 215 220 



Thr Pro Gly Val Thr Pro Asp Met Ala Asp Gln Ile Ala Ser Leu Val
225 230 235 240



Leu Ala Lys Leu Gly Lys Asp Ala Thr Lys Pro Gin Gin Val Thr Lys 
245 250 255 



Gln Thr Ala Lys Glu Val Arg Gln Lys Ile Leu Asn Lys Pro Arg Gln
260 265 270



Lys Arg Ser Pro Asn Lys Gin Cys Thr Val Gin Gin Cys Phe Gly Lys 
275 280 285 
Arg Gly Pro Asn Gln Asn Phe Gly Gly Gly Glu Met Leu Lys Leu Gly
290 295 300



Thr Ser Asp Pro Gin Phe Pro He Leu Ala Glu Leu Ala Pro Thr Ala 
305 310 315 320 



Gly Ala Phe Phe Phe Gly Ser Arg Leu Glu Leu Ala Lys Val Gin Asn 
325 330 335



Leu Ser Gly Asn Pro Asp Glu Pro Gin Lys Asp Val Tyr Glu Leu Arg 
340 345 350 



Tyr Asn Gly Ala He Arg Phe Asp Ser Thr Leu Ser Gly Phe Glu Thr 
355 360 365 



Ile Met Lys Val Leu Asn Gln Asn Leu Asn Ala Tyr Gln His Gln Glu
370 375 380 



Asp Gly Met Met Asn He Ser Pro Lys Pro Gin Arg Gin Arg Gly Gin 
385 390 395 400 



Lys Asn Gly Gin Val Glu Asn Asp Asn Val Ser Val Ala Ala Pro Lys 
405 410 415 



Ser Arg Val Gin Gin Asn Lys Ser Arg Glu Leu Thr Ala Glu Asp He 
420 425 430 



Ser Leu Leu Lys Lys Met Asp Glu Pro Tyr Thr Glu Asp Thr Ser Glu 
435 440 445 



Ile



<210> 51 

<211> 409 

<212> PRT 

<213> turkey coronavirus 

<400> 51 

Met Ala Ser Gly Lys Ala Thr Gly Lys Thr Asp Ala Pro Ala Pro Ile
1 5 10 15



He Lys Leu Gly Gly Pro Lys Pro Pro Lys Val Gly Ser Ser Gly Asn 
20 25 30 



Ala Ser Trp Phe Gin Ser He Lys Ala Lys Lys Leu Asn Ser Pro Gin 
35 40 45






Pro Lys Phe Glu Gly Ser Gly Val Pro Asp Asn Glu Asn Ile Lys Thr
50 55 60 



Ser Gin Gin His Gly Tyr Trp Arg Arg Gin Ala Arg Phe Lys Pro Gly 
65 70 75 80



Lys Gly Gly Arg Lys Pro Val Pro Asp Ala Trp Tyr Phe Tyr Tyr Thr
85 90 95 



Gly Thr Gly Pro Ala Ala Asp Leu Asn Trp Gly Asp Thr Gin Asp Gly 
100 105 110 



lie Val Trp Val Ala Ala Lys Gly Ala Asp Val Lys Ser Arg Ser Asn 
115 120 125



Gln Gly Thr Arg Asp Pro Asp Lys Phe Asp Gln Tyr Pro Leu Arg Phe
130 135 140 



Ser Asp Gly Gly Pro Asp Ser Asn Phe Arg Trp Asp Phe He Pro Leu 
145 150 155 160 



His Arg Gly Arg Ser Gly Arg Ser Thr Ala Ala Ser Ser Ala Ala Ser
165 170 175 



Ser Arg Ala Pro Ser Arg Asp Gly Ser Arg Gly Arg Arg Ser Gly Ser 
180 185 190 



Glu Asp Asp Leu Ile Ala Arg Ala Ala Lys Ile Ile Gln Asp Gln Gln
195 200 205



Lys Lys Gly Ser Arg He Thr Lys Ala Lys Ala Asp Glu Met Ala His 
210 215 220 



Arg Arg Tyr Cys Lys Arg Thr Val Pro Pro Gly Tyr Lys Val Asp Gin 
225 230 235 240 



Val Phe Gly Pro Arg Thr Lys Gly Lys Glu Gly Asn Phe Gly Asp Asp 
245 250 255 



Lys Met Asn Glu Glu Gly Ile Lys Asp Gly Arg Val Thr Ala Met Leu
260 265 270 



Asn Leu Val Pro Ser Ser His Ala Cys Leu Phe Gly Ser Arg Val Thr
275 280 285 






Pro Lys Leu Gln Pro Asp Gly Leu His Leu Arg Phe Glu Phe Thr Thr
290 295 300 



Val Val Pro Arg Asp Asp Pro Gln Phe Asp Asn Tyr Val Thr Ile Cys
305 310 315 320

Asp Gln Cys Val Asp Gly Ile Gly Thr Arg Pro Lys Asp Asn Glu Pro
325 330 335



Arg Pro Lys Ser Arg Pro Ser Ser Arg Pro Ala Thr Arg Gly Asn Ser
340 345 350



Pro Ala Pro Arg Gin Gin Arg Pro Lys Lys Glu Lys Lys Pro Lys Lys 
355 360 365 



Gln Asp Asp Glu Val Asp Lys Ala Leu Thr Ser Asp Glu Glu Arg Asn
370 375 380 



Asn Ala Gln Leu Glu Phe Asp Asp Glu Pro Lys Val Ile Asn Trp Gly
385 390 395 400



Asp Ser Ala Leu Gly Glu Asn His Leu
405 



<210> 52

<211> 1173 

<212> PRT 

<213> Human coronavirus 229E

<400> 52

Met Phe Val Leu Leu Val Ala Tyr Ala Leu Leu His Ile Ala Gly Cys
1 5 10 15



Gln Thr Thr Asn Gly Leu Asn Thr Ser Tyr Ser Val Cys Asn Gly Cys
20 25 30 



Val Gly Tyr Ser Glu Asn Val Phe Ala Val Glu Ser Gly Gly Tyr Ile
35 40 45



Pro Ser Asp Phe Ala Phe Asn Asn Trp Phe Leu Leu Thr Asn Thr Ser 
50 55 60 



Ser Val Val Asp Gly Val Val Arg Ser Phe Gln Pro Leu Leu Leu Asn
65 70 75 80 



Cys Leu Trp Ser Val Ser Gly Leu Arg Phe Thr Thr Gly Phe Val Tyr 






85 90 95 



Phe Asn Gly Thr Gly Arg Gly Asp Cys Lys Gly Phe Ser Ser Asp Val 
100 105 110



Leu Ser Asp Val He Arg Tyr Asn Leu Asn Phe Glu Glu Asn Leu Arg 
115 120 125 



Arg Gly Thr Ile Leu Phe Lys Thr Ser Tyr Gly Val Val Val Phe Tyr
130 135 140 



Cys Thr Asn Asn Thr Leu Val Ser Gly Asp Ala His Ile Pro Phe Gly
145 150 155 160



Thr Val Leu Gly Asn Phe Tyr Cys Phe Val Asn Thr Thr He Gly Thr 
165 170 175 



Glu Thr Thr Ser Ala Phe Val Gly Ala Leu Pro Lys Thr Val Arg Glu 
180 185 190 



Phe Val Ile Ser Arg Thr Gly His Phe Tyr Ile Asn Gly Tyr Arg Tyr
195 200 205 



Phe Thr Leu Gly Asn Val Glu Ala Val Asn Phe Asn Val Thr Thr Ala
210 215 220 



Glu Thr Thr Asp Phe Phe Thr Val Ala Leu Ala Ser Tyr Ala Asp Val 
225 230 235 240



Leu Val Asn Val Ser Gln Thr Ser Ile Ala Asn Ile Ile Tyr Cys Asn
245 250 255 



Ser Val Ile Asn Arg Leu Arg Cys Asp Gln Leu Ser Phe Tyr Val Pro
260 265 270 



Asp Gly Phe Tyr Ser Thr Ser Pro Ile Gln Ser Val Glu Leu Pro Val
275 280 285 



Ser He Val Ser Leu Pro Val Tyr His Lys His Met Phe He Val Leu 
290 295 300 



Tyr Val Asp Phe Lys Pro Gin Ser Gly Gly Gly Lys Cys Phe Asn Cys 
305 310 315 320 



Tyr Pro Ala Gly Val Asn He Thr Leu Ala Asn Phe Asn Glu Thr Lys 
325 330 335 
Gly Pro Leu Cys Val Asp Thr Ser His Phe Thr Thr Lys Tyr Val Ala
340 345 350 



Val Tyr Ala Asn Val Gly Arg Trp Ser Ala Ser Ile Asn Thr Gly Asn
355 360 365 



Cys Pro Phe Ser Phe Gly Lys Val Asn Asn Phe Val Lys Phe Gly Ser 
370 375 380



Val Cys Phe Ser Leu Lys Asp Ile Pro Gly Gly Cys Ala Met Pro Ile
385 390 395 400

Val Ala Asn Trp Ala Tyr Ser Lys Tyr Tyr Thr lie Gly Thr Leu Tyr 
405 410 415 



Val Ser Trp Ser Asp Gly Asp Gly lie Thr Gly Val Pro Gin Pro Val 
420 425 430



Glu Gly Val Ser Ser Phe Met Asn Val Thr Leu Asp Lys Cys Thr Lys 
435 440 445

Tyr Asn lie Tyr Asp Val Ser Gly Val Gly Val lie Arg Val Ser Asn 
450 455 460



Asp Thr Phe Leu Asn Gly Ile Thr Tyr Thr Ser Thr Ser Gly Asn Leu
465 470 475 480



Leu Gly Phe Lys Asp Val Thr Lys Gly Thr Ile Tyr Ser Ile Thr Pro
485 490 495



Cys Asn Pro Pro Asp Gin Leu Val Val Tyr Gin Gin Ala Val Val Gly 
500 505 510 



Ala Met Leu Ser Glu Asn Phe Thr Ser Tyr Gly Phe Ser Asn Val Val
515 520 525 



Glu Leu Pro Lys Phe Phe Tyr Ala Ser Asn Gly Thr Tyr Asn Cys Thr 
530 535 540 



Asp Ala Val Leu Thr Tyr Ser Ser Phe Gly Val Cys Ala Asp Gly Ser 
545 550 555 560 



lie lie Ala Val Gin Pro Arg Asn Val Ser Tyr Asp Ser Val Ser Ala 
565 570 575 
Ile Val Thr Ala Asn Leu Ser Ile Pro Ser Asn Trp Thr Ile Ser Val
580 585 590



Gln Val Glu Tyr Leu Gln Ile Thr Ser Thr Pro Ile Val Val Asp Cys
595 600 605 



Ser Thr Tyr Val Cys Asn Gly Asn Val Arg Cys Val Glu Leu Leu Lys
610 615 620 



Gln Tyr Thr Ser Ala Cys Lys Thr Ile Glu Asp Ala Leu Arg Asn Ser
625 630 635 640



Ala Arg Leu Glu Ser Ala Asp Val Ser Glu Met Leu Thr Phe Asp Lys 
645 650 655



Lys Ala Phe Thr Leu Ala Asn Val Ser Ser Phe Gly Asp Tyr Asn Leu
660 665 670 



Ser Ser Val Ile Pro Ser Leu Pro Thr Ser Gly Ser Arg Val Ala Gly
675 680 685



Arg Ser Ala He Glu Asp He Leu Phe Ser Lys He Val Thr Ser Gly 
690 695 700



Leu Gly Thr Val Asp Ala Asp Tyr Lys Asn Cys Thr Lys Gly Leu Ser
705 710 715 720



He Ala Asp Leu Ala Cys Ala Gin Tyr Tyr Asn Gly He Met Val Leu 
725 730 735 



Pro Gly Val Ala Asp Ala Glu Arg Met Ala Met Tyr Thr Gly Ser Leu
740 745 750 



He Gly Gly He Ala Leu Gly Gly Leu Thr Ser Ala Val Ser He Pro 
755 760 765 



Phe Ser Leu Ala He Gin Ala Arg Leu Asn Tyr Val Ala Leu Gin Thr 
770 775 780 



Asp Val Leu Gin Glu Asn Gin Lys He Leu Ala Ala Ser Phe Asn Lys 
785 790 795 800 



Ala Met Thr Asn He Val Asp Ala Phe Thr Gly Val Asn Asp Ala He 
805 810 815 






Thr Gln Thr Ser Gln Ala Leu Gln Thr Val Ala Thr Ala Leu Asn Lys
820 825 830



lie Gin Asp Val Val Asn Gin Gin Gly Asn Ser Leu Asn His Leu Thr 
835 840 845



Ser Gin Leu Arg Gin Asn Phe Gin Ala lie Ser Ser Ser He Gin Ala 
850 855 860



He Tyr Asp Arg Leu Asp Thr He Gin Ala Asp Gin Gin Val Asp Arg 
865 870 875 880



Leu He Thr Gly Arg Leu Ala Ala Leu Asn Val Phe Val Ser His Thr 
885 890 895



Leu Thr Lys Tyr Thr Glu Val Arg Ala Ser Arg Gin Leu Ala Gin Gin 
900 905 910 



Lys Val Asn Glu Cys Val Lys Ser Gin Ser Lys Arg Tyr Gly Phe Cys 
915 920 925 



Gly Asn Gly Thr His Ile Phe Ser Ile Val Asn Ala Ala Pro Glu Gly
930 935 940



Leu Val Phe Leu His Thr Val Leu Leu Pro Thr Gin Tyr Lys Asp Val 
945 950 955 960 



Glu Ala Trp Ser Gly Leu Cys Val Asp Gly Thr Asn Gly Tyr Val Leu 
965 970 975



Arg Gin Pro Asn Leu Ala Leu Tyr Lys Glu Gly Asn Tyr Tyr Arg He 
980 985 990 



Thr Ser Arg He Met Phe Glu Pro Arg He Pro Thr Met Ala Asp Phe 
995 1000 1005



Val Gin He Glu Asn Cys Asn Val Thr Phe Val Asn He Ser Arg 
1010 1015 1020 



Ser Glu Leu Gin Thr He Val Pro Glu Tyr He Asp Val Asn Lys 
1025 1030 1035 



Thr Leu Gin Glu Leu Ser Tyr Lys Leu Pro Asn Tyr Thr Val Pro 
1040 1045 1050 



Asp Leu Val Val Glu Gin Tyr Asn Gin Thr He Leu Asn Leu Thr 
1055 1060 1065

Ser Glu Ile Ser Thr Leu Glu Asn Lys Ser Ala Glu Leu Asn Tyr
1070 1075 1080

Thr Val Gin Lys Leu Gin Thr Leu lie Asp Asn lie Asn Ser Thr 
1085 1090 1095 

Leu Val Asp Leu Lys Trp Leu Asn Arg Val Glu Thr Tyr lie Lys 
1100 1105 1110

Trp Pro Trp Trp Val Trp Leu Cys Ile Ser Val Val Leu Ile Phe
1115 1120 1125 

Val Val Ser Met Leu Leu Leu Cys Cys Cys Ser Thr Gly Cys Cys 
1130 1135 1140 

Gly Phe Phe Ser Cys Phe Ala Ser Ser Ile Arg Gly Cys Cys Glu
1145 1150 1155

Ser Thr Lys Leu Pro Tyr Tyr Asp Val Glu Lys Ile His Ile Gln
1160 1165 1170

<210> 53

<211> 1164 
<212> PRT 

<213> Avian infectious bronchitis virus

<400> 53 

Met Leu Gly Lys Ser Leu Phe Leu Val Thr Ile Leu Cys Ala Leu Cys
1 5 10 15

Ser Ala Asn Leu Phe Asp Pro Ala Asn Tyr Val Tyr Tyr Tyr Gin Ser 
20 25 30 

Ala Phe Arg Pro Ser Asn Gly Trp His Leu Gln Gly Gly Ala Tyr Ala
35 40 45

Val Val Asn Ser Ser Asn Tyr Ala Asn Asn Ala Gly Ser Ala Ser Glu
50 55 60



Cys Thr Val Gly Val Ile Lys Asp Val Tyr Asn Gln Ser Ala Ala Ser
65 70 75 80 



Ile Ala Met Thr Ala Pro Leu Gln Gly Met Ala Trp Ser Lys Ser Gln
85 90 95 
Phe Cys Ser Ala His Cys Asp Phe Ser Glu Ile Thr Val Phe Val Thr
100 105 110



His Cys Tyr Ser Ser Gly Ser Gly Ser Cys Pro Ile Thr Gly Met Ile
115 120 125



Ala Arg Gly His He Arg He Ser Ala Met Lys Asn Gly Ser Leu Phe 
130 135 140 



Tyr Asn Leu Thr Val Ser Val Ser Lys Tyr Pro Asn Phe Lys Ser Phe 

145 150 155 160 

Gln Cys Val Asn Asn Phe Thr Ser Val Tyr Leu Asn Gly Asp Leu Val
165 170 175



Phe Thr Ser Asn Lys Thr Thr Asp Val Thr Ser Ala Gly Val Tyr Phe 
180 185 190 



Lys Ala Gly Gly Pro Val Asn Tyr Ser Ile Met Lys Glu Phe Lys Val
195 200 205 



Leu Ala Tyr Phe Val Asn Gly Thr Ala Gln Asp Val Ile Leu Cys Asp
210 215 220



Asn Ser Pro Lys Gly Leu Leu Ala Cys Gin Tyr Asn Thr Gly Asn Phe 
225 230 235 240



Ser Asp Gly Phe Tyr Pro Phe Thr Asn Ser Thr Leu Val Arg Glu Lys 
245 250 255 



Phe Ile Val Tyr Arg Glu Ser Ser Val Asn Thr Thr Leu Ala Leu Thr
260 265 270 



Asn Phe Thr Phe Thr Asn Val Ser Asn Ala Gln Pro Asn Ser Gly Gly
275 280 285



Val His Thr Phe His Leu Tyr Gin Thr Gin Thr Ala Gin Ser Gly Tyr 
290 295 300



Tyr Asn Phe Asn Leu Ser Phe Leu Ser Gln Phe Val Tyr Lys Ala Ser
305 310 315 320 



Asp Tyr Met Tyr Gly Ser Tyr His Pro He Cys Ala Phe Arg Pro Glu 
325 330 335 
Thr Ile Asn Ser Gly Leu Trp Phe Asn Ser Leu Ser Val Ser Leu Thr
340 345 350



Tyr Gly Pro Leu Gln Gly Gly Tyr Lys Gln Ser Val Phe Ser Gly Lys
355 360 365 



Ala Thr Cys Cys Tyr Ala Tyr Ser Tyr Asn Gly Pro Arg Ala Cys Lys
370 375 380



Gly Val Tyr Ser Gly Glu Leu Ser Arg Asp Phe Glu Cys Gly Leu Leu
385 390 395 400



Val Tyr Val Thr Lys Ser Asp Gly Ser Arg lie Gin Thr Arg Thr Glu 
405 410 415 



Pro Leu Val Leu Thr Gin His Asn Tyr Asn Asn He Thr Leu Asp Lys 
420 425 430



Cys Val Ala Tyr Asn He Tyr Gly Arg Val Gly Gin Gly Phe He Thr 
435 440 445 



Asn Val Thr Asp Ser Val Ala Asn Phe Ser Tyr Leu Ala Asp Gly Gly
450 455 460



Leu Ala He Leu Asp Thr Ser Gly Ala He Asp Val Phe Val Val Gin 
465 470 475 480 



Gly Ser Tyr Gly Leu Asn Tyr Tyr Lys Val Asn Pro Cys Glu Asp Val 
485 490 495 



Asn Gin Gin Phe Val Val Ser Gly Gly Asn He Val Gly He Leu Thr 
500 505 510 



Ser Arg Asn Glu Thr Gly Ser Glu Gln Val Glu Asn Gln Phe Tyr Val
515 520 525 



Lys Leu Thr Asn Ser Ser His Arg Arg Arg Arg Ser Ile Gly Gln Asn
530 535 540 



Val Thr Ser Cys Pro Tyr Val Ser Tyr Gly Arg Phe Cys He Glu Pro 
545 550 555 560



Asp Gly Ser Leu Lys Met Ile Val Pro Glu Glu Leu Lys Gln Phe Val
565 570 575 



Ala Pro Leu Leu Asn He Thr Glu Ser Val Leu He Pro Asn Ser Phe 
580 585 590



Asn Leu Thr Val Thr Asp Glu Tyr Ile Gln Thr Arg Met Asp Lys Val
595 600 605



Gln Ile Asn Cys Leu Gln Tyr Val Cys Gly Asn Ser Leu Glu Cys Arg
610 615 620 



Lys Leu Phe Gin Gin Tyr Gly Pro Val Cys Asp Asn lie Leu Ser Val 

625 630 635 640 



Val Asn Ser Val Ser Gln Lys Glu Asp Met Glu Leu Leu Ser Phe Tyr
645 650 655 



Ser Ser Thr Lys Pro Lys Gly Tyr Asp Thr Pro Val Leu Ser Asn Val
660 665 670 



Ser Thr Gly Glu Phe Asn lie Ser Leu Leu Leu Thr Pro Pro Ser Ser 
675 680 685 



Pro Ser Gly Arg Ser Phe Val Glu Asp Leu Leu Phe Thr Ser Val Glu 
690 695 700 



Thr Val Gly Leu Pro Thr Asp Ala Glu Tyr Lys Lys Cys Thr Ala Gly
705 710 715 720



Pro Leu Gly Thr Leu Lys Asp Leu lie Cys Ala Arg Glu Tyr Asn Gly 
725 730 735



Leu Leu Val Leu Pro Pro lie lie Thr Ala Asp Met Gin Thr Met Tyr 
740 745 750 



Thr Ala Ser Leu Val Gly Ala Met Ala Phe Gly Gly Ile Thr Ser Ala
755 760 765



Ala Ala Ile Pro Phe Ala Thr Gln Ile Gln Ala Arg Ile Asn His Leu
770 775 780 



Gly lie Ala Gin Ser Leu Leu Met Lys Asn Gin Glu Lys He Ala Ala 
785 790 795 800 



Ser Phe Asn Lys Ala He Gly His Met Gin Glu Gly Phe Arg Ser Thr 
805 810 815 



Ser Leu Ala Leu Gin Gin Val Gin Asp Val Val Asn Lys Gin Ser Ala 
820 825 830 
Ile Leu Thr Glu Thr Met Asn Ser Leu Asn Lys Asn Phe Gly Ala Ile
835 840 845



Ser Ser Val Ile Gln Asp Ile Tyr Ala Gln Leu Asp Ala Ile Gln Ala
850 855 860



Asp Ala Gin Val Asp Arg Leu He Thr Gly Arg Leu Ser Ser Leu Ser 
865 870 875 880 



Val Leu Ala Ser Ala Lys Gin Ser Glu Tyr lie Arg Val Ser Gin Gin 
885 890 895



Arg Glu Leu Ala Thr Gin Lys He Asn Glu Cys Val Lys Ser Gin Ser 
900 905 910



Asn Arg Tyr Gly Phe Cys Gly Ser Gly Arg His Val Leu Ser He Pro 
915 920 925 



Gin Asn Ala Pro Asn Gly He Val Phe He His Phe Thr Tyr Thr Pro 
930 935 940



Glu Thr Phe Val Asn Val Thr Ala Ile Val Gly Phe Cys Val Asn Pro
945 950 955 960



Leu Asn Ala Ser Gin Tyr Ala He Val Pro Ala Asn Gly Arg Gly lie 
965 970 975



Phe He Gin Val Asn Gly Thr Tyr Tyr He Thr Ser Arg Asp Met Tyr 
980 985 990 



Met Pro Arg Asp lie Thr Ala Gly Asp He Val Thr Leu Thr Ser Cys 
995 1000 1005



Gin Ala Asn Tyr Val Asn Val Asn Lys Thr Val lie Thr Thr Phe 
1010 1015 1020 



Val Glu Asp Asp Asp Phe Asn Phe Asp Asp Glu Leu Ser Lys Trp 
1025 1030 1035 



Trp Asn Asp Thr Lys His Gly Leu Pro Asp Phe Asp Asp Phe Asn 
1040 1045 1050 



Tyr Thr Val Pro He Leu Asn He Ser Gly Glu He Asp Asn He 
1055 1060 1065 
Gln Gly Val Ile Gln Gly Leu Asn Asp Ser Leu Ile Asn Leu Glu
1070 1075 1080



Glu Leu Ser Ile Ile Lys Thr Tyr Ile Lys Trp Pro Trp Tyr Val
1085 1090 1095



Trp Leu Ala Ile Gly Phe Ala Ile Ile Ile Phe Ile Leu Ile Leu
1100 1105 1110 



Gly Trp Val Phe Phe Met Thr Gly Cys Cys Gly Cys Cys Cys Gly 
1115 1120 1125 



Cys Phe Gly He lie Pro Leu lie Ser Lys Cys Gly Lys Lys Ser 
1130 1135 1140 



Ser Tyr Tyr Thr Thr Phe Asp Asn Asp Val Val Thr Glu Gln Tyr
1145 1150 1155 



Arg Pro Lys Lys Ser Val 
1160 



<210> 54 

<211> 1363

<212> PRT

<213> Bovine coronavirus

<400> 54 

Met Phe Leu Ile Leu Leu Ile Ser Leu Pro Met Ala Phe Ala Val Ile
1 5 10 15



Gly Asp Leu Lys Cys Thr Thr Val Ser He Asn Asp Val Asp Thr Gly 
20 25 30 



Ala Pro Ser Ile Ser Thr Asp Ile Val Asp Val Thr Asn Gly Leu Gly
35 40 45 



Thr Tyr Tyr Val Leu Asp Arg Val Tyr Leu Asn Thr Thr Leu Leu Leu 
50 55 60 



Asn Gly Tyr Tyr Pro Thr Ser Gly Ser Thr Tyr Arg Asn Met Ala Leu 
65 70 75 80 



Lys Gly Thr Leu Leu Leu Ser Arg Leu Trp Phe Lys Pro Pro Phe Leu 
85 90 95 



Ser Asp Phe He Asn Gly He Phe Ala Lys Val Lys Asn Thr Lys Val 




100 105 110



Ile Lys Lys Gly Val Met Tyr Ser Glu Phe Pro Ala Ile Thr Ile Gly
115 120 125



Ser Thr Phe Val Asn Thr Ser Tyr Ser Val Val Val Gln Pro His Thr
130 135 140 



Thr Asn Leu Asp Asn Lys Leu Gln Gly Leu Leu Glu Ile Ser Val Cys
145 150 155 160



Gln Tyr Thr Met Cys Glu Tyr Pro His Thr Ile Cys His Pro Lys Leu
165 170 175 



Gly Asn Lys Arg Val Glu Leu Trp His Trp Asp Thr Gly Val Val Ser
180 185 190



Cys Leu Tyr Lys Arg Asn Phe Thr Tyr Asp Val Asn Ala Asp Tyr Leu
195 200 205 



Tyr Phe His Phe Tyr Gln Glu Gly Gly Thr Phe Tyr Ala Tyr Phe Thr
210 215 220 



Asp Thr Gly Val Val Thr Lys Phe Leu Phe Asn Val Tyr Leu Gly Thr
225 230 235 240



Val Leu Ser His Tyr Tyr Val Leu Pro Leu Thr Cys Ser Ser Ala Met 
245 250 255



Thr Leu Glu Tyr Trp Val Thr Pro Leu Thr Ser Lys Gin Tyr Leu Leu 
260 265 270 



Ala Phe Asn Gln Asp Gly Val Ile Phe Asn Ala Val Asp Cys Lys Ser
275 280 285 



Asp Phe Met Ser Glu He Lys Cys Lys Thr Leu Ser He Ala Pro Ser 
290 295 300 



Thr Gly Val Tyr Glu Leu Asn Gly Tyr Thr Val Gln Pro Ile Ala Asp
305 310 315 320 



Val Tyr Arg Arg He Pro Asn Leu Pro Asp Cys Asn He Glu Ala Trp 
325 330 335 



Leu Asn Asp Lys Ser Val Pro Ser Pro Leu Asn Trp Glu Arg Lys Thr 
340 345 350 
Phe Ser Asn Cys Asn Phe Asn Met Ser Ser Leu Met Ser Phe Ile Gln
355 360 365



Ala Asp Ser Phe Thr Cys Asn Asn Ile Asp Ala Ala Lys Ile Tyr Gly
370 375 380



Met Cys Phe Ser Ser He Thr He Asp Lys Phe Ala He Pro Asn Gly 
385 390 395 400 



Arg Lys Val Asp Leu Gln Leu Gly Asn Leu Gly Tyr Leu Gln Ser Phe
405 410 415



Asn Tyr Arg He Asp Thr Thr Ala Thr Ser Cys Gin Leu Tyr Tyr Asn 
420 425 430 



Leu Pro Ala Ala Asn Val Ser Val Ser Arg Phe Asn Pro Ser Thr Trp 
435 440 445



Asn Arg Arg Phe Gly Phe Thr Glu Gln Phe Val Phe Lys Pro Gln Pro
450 455 460



Val Gly Val Phe Thr His His Asp Val Val Tyr Ala Gln His Cys Phe
465 470 475 480



Lys Ala Pro Lys Asn Phe Cys Pro Cys Lys Leu Asp Gly Ser Leu Cys 
485 490 495 



Val Gly Asn Gly Pro Gly Ile Asp Ala Gly Tyr Lys Asn Ser Gly Ile
500 505 510 



Gly Thr Cys Pro Ala Gly Thr Asn Tyr Leu Thr Cys His Asn Ala Ala 
515 520 525 



Gln Cys Asp Cys Leu Cys Thr Pro Asp Pro Ile Thr Ser Lys Ser Thr
530 535 540



Gly Pro Tyr Lys Cys Pro Gin Thr Lys Tyr Leu Val Gly He Gly Glu 
545 550 555 560 



His Cys Ser Gly Leu Ala He Lys Ser Asp Tyr Cys Gly Gly Asn Pro 
565 570 575 



Cys Thr Cys Gin Pro Gin Ala Phe Leu Gly Trp Ser Val Asp Ser Cys 
580 585 590
Leu Gln Gly Asp Arg Cys Asn Ile Phe Ala Asn Phe Ile Phe His Asp
595 600 605



Val Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gln Lys Ser Asn Thr
610 615 620



Asp Ile Ile Leu Gly Val Cys Val Asn Tyr Asp Leu Tyr Gly Ile Thr
625 630 635 640



Gly Gln Gly Ile Phe Val Glu Val Asn Ala Thr Tyr Tyr Asn Ser Trp
645 650 655



Gln Asn Leu Leu Tyr Asp Ser Asn Gly Asn Leu Tyr Gly Phe Arg Asp
660 665 670 



Tyr Leu Thr Asn Arg Thr Phe Met Ile Arg Ser Cys Tyr Ser Gly Arg
675 680 685



Val Ser Ala Ala Phe His Ala Asn Ser Ser Glu Pro Ala Leu Leu Phe 
690 695 700 



Arg Asn Ile Lys Cys Asn Tyr Val Phe Asn Asn Thr Leu Ser Arg Gln
705 710 715 720



Leu Gln Pro Ile Asn Tyr Phe Asp Ser Tyr Leu Gly Cys Val Val Asn
725 730 735 



Ala Asp Asn Ser Thr Ser Ser Val Val Gln Thr Cys Asp Leu Thr Val
740 745 750 



Gly Ser Gly Tyr Cys Val Asp Tyr Ser Thr Lys Arg Arg Ser Arg Arg 
755 760 765 



Ala He Thr Thr Gly Tyr Arg Phe Thr Asn Phe Glu Pro Phe Thr Val 
770 775 780 



Asn Ser Val Asn Asp Ser Leu Glu Pro Val Gly Gly Leu Tyr Glu Ile
785 790 795 800 



Gin He Pro Ser Glu Phe Thr He Gly Asn Met Glu Glu Phe He Gin 
805 810 815 



Thr Ser Ser Pro Lys Val Thr He Asp Cys Ser Ala Phe Val Cys Gly 
820 825 830 
Asp Tyr Ala Ala Cys Lys Ser Gln Leu Val Glu Tyr Gly Ser Phe Cys
835 840 845



Asp Asn Ile Asn Ala Ile Leu Thr Glu Val Asn Glu Leu Leu Asp Thr
850 855 860



Thr Gln Leu Gln Val Ala Asn Ser Leu Met Asn Gly Val Thr Leu Ser
865 870 875 880



Thr Lys Leu Lys Asp Gly Val Asn Phe Asn Val Asp Asp Ile Asn Phe
885 890 895



Ser Pro Val Leu Gly Cys Leu Gly Ser Ala Cys Asn Lys Val Ser Ser 
900 905 910 



Arg Ser Ala He Glu Asp Leu Leu Phe Ser Lys Val Lys Leu Ser Asp 
915 920 925 



Val Gly Phe Val Glu Ala Tyr Asn Asn Cys Thr Gly Gly Ala Glu Ile
930 935 940 



Arg Asp Leu Ile Cys Val Gln Ser Tyr Asn Gly Ile Lys Val Leu Pro
945 950 955 960



Pro Leu Leu Ser Val Asn Gln Ile Ser Gly Tyr Thr Leu Ala Ala Thr
965 970 975



Ser Ala Ser Leu Phe Pro Pro Leu Ser Ala Ala Val Gly Val Pro Phe 
980 985 990 



Tyr Leu Asn Val Gln Tyr Arg Ile Asn Gly Ile Gly Val Thr Met Asp
995 1000 1005 



Val Leu Ser Gln Asn Gln Lys Leu Ile Ala Asn Ala Phe Asn Asn
1010 1015 1020 



Ala Leu Asp Ala He Gin Glu Gly Phe Asp Ala Thr Asn Ser Ala 
1025 1030 1035 



Leu Val Lys He Gin Ala Val Val Asn Ala Asn Ala Glu Ala Leu 
1040 1045 1050 



Asn Asn Leu Leu Gin Gin Leu Ser Asn Arg Phe Gly Ala He Ser 
1055 1060 1065 



Ser Ser Leu Gin Glu He Leu Ser Arg Leu Asp Ala Leu Glu Ala 
1070 1075 1080



Gln Ala Gln Ile Asp Arg Leu Ile Asn Gly Arg Leu Thr Ala Leu
1085 1090 1095 



Asn Val Tyr Val Ser Gln Gln Leu Ser Asp Ser Thr Leu Val Lys
1100 1105 1110 



Phe Ser Ala Ala Gin Ala Met Glu Lys Val Asn Glu Cys Val Lys 
1115 1120 1125



Ser Gin Ser Ser Arg He Asn Phe Cys Gly Asn Gly Asn His He 
1130 1135 1140 



Ile Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu Tyr Phe Ile His
1145 1150 1155 



Phe Ser Tyr Val Pro Thr Lys Tyr Val Thr Ala Lys Val Ser Pro 
1160 1165 1170



Gly Leu Cys He Ala Gly Asp Arg Gly He Ala Pro Lys Ser Gly 
1175 1180 1185



Tyr Phe Val Asn Val Asn Asn Thr Trp Met Phe Thr Gly Ser Gly
1190 1195 1200 



Tyr Tyr Tyr Pro Glu Pro He Thr Gly Asn Asn Val Val Val Met 
1205 1210 1215 



Ser Thr Cys Ala Val Asn Tyr Thr Lys Ala Pro Asp Val Met Leu 
1220 1225 1230



Asn He Ser Thr Pro Asn Leu His Asp Phe Lys Glu Glu Leu Asp 
1235 1240 1245



Gln Trp Phe Lys Asn Gln Thr Ser Val Ala Pro Asp Leu Ser Leu
1250 1255 1260 



Asp Tyr Ile Asn Val Thr Phe Leu Asp Leu Gln Asp Glu Met Asn
1265 1270 1275 



Arg Leu Gin Glu Ala He Lys Val Leu Asn Gin Ser Tyr He Asn 
1280 1285 1290 



Leu Lys Asp He Gly Thr Tyr Glu Tyr Tyr Val Lys Trp Pro Trp 
1295 1300 1305 
Tyr Val Trp Leu Leu Ile Gly Phe Ala Gly Val Ala Met Leu Val
1310 1315 1320



Leu Leu Phe Phe Ile Cys Cys Cys Thr Gly Cys Gly Thr Ser Cys
1325 1330 1335



Phe Lys He Cys Gly Gly Cys Cys Asp Asp Tyr Thr Gly His Gin 
1340 1345 1350 



Glu Leu Val He Lys Thr Ser His Asp Asp 
1355 1360 



<210> 55 
<211> 1453 
<212> PRT 

<213> canine coronavirus 
<400> 55 

Met Ile Val Leu Ile Leu Cys Leu Leu Leu Phe Ser Tyr Asn Ser Val
1 5 10 15



Ile Cys Thr Ser Asn Asn Asp Cys Val Gln Gly Asn Val Thr Gln Leu
20 25 30



Pro Gly Asn Glu Asn Ile Ile Lys Asp Phe Leu Phe His Thr Phe Lys
35 40 45



Glu Glu Pro Ser Val Val Val Gly Gly Tyr Tyr Pro Thr Glu Val Trp
50 55 60



Tyr Asn Cys Ser Arg Ser Ala Thr Thr Thr Ala Tyr Lys Asp Phe Ser
65 70 75 80



Asn Ile His Ala Phe Tyr Phe Asp Met Glu Ala Met Glu Asn Ser Thr
85 90 95



Gly Asn Ala Arg Gly Lys Pro Leu Leu Val His Val His Gly Asp Pro
100 105 110



Val Ser Ile Ile Ile Tyr Ile Ser Ala Tyr Arg Asp Asp Val Gln Pro
115 120 125 



Arg Pro Leu Leu Lys His Gly Leu Leu Cys Ile Thr Lys Asn Lys Ile
130 135 140 
Ile Asp Tyr Asn Thr Phe Thr Ser Ala Gln Trp Ser Ala Ile Cys Leu
145 150 155 160



Gly Asp Asp Arg Lys Ile Pro Phe Ser Val Ile Pro Thr Asp Asn Gly
165 170 175 



Thr Lys Ile Phe Gly Leu Glu Trp Asn Asp Asp Tyr Val Thr Ala Tyr
180 185 190



Ile Ser Asp Arg Ser His His Leu Asn Ile Asn Asn Asn Trp Phe Asn
195 200 205

Asn Val Thr Ile Leu Tyr Ser Arg Ser Ser Ser Ala Thr Trp Gln Lys
210 215 220



Ser Ala Ala Tyr Val Tyr Gln Gly Val Ser Asn Phe Thr Tyr Tyr Lys
225 230 235 240



Leu Asn Asn Thr Asn Gly Leu Lys Ser Tyr Glu Leu Cys Glu Asp Tyr 
245 250 255



Glu Tyr Cys Thr Gly Tyr Ala Thr Asn Val Phe Ala Pro Thr Val Gly 
260 265 270 



Gly Tyr Ile Pro His Gly Phe Ser Phe Asn Asn Trp Phe Met Arg Thr
275 280 285 



Asn Ser Ser Thr Phe Val Ser Gly Arg Phe Val Thr Asn Gln Pro Leu
290 295 300 



Leu Val Asn Cys Leu Trp Pro Val Pro Ser Phe Gly Val Ala Ala Gln
305 310 315 320



Gln Phe Cys Phe Glu Gly Ala Gln Phe Ser Gln Cys Asn Gly Val Ser
325 330 335 



Leu Asn Asn Thr Val Asp Val Ile Arg Phe Asn Leu Asn Phe Thr Ala
340 345 350

Leu Val Gln Ser Gly Met Gly Ala Thr Val Phe Ser Leu Asn Thr Thr
355 360 365



Gly Gly Val Ile Leu Glu Ile Ser Cys Tyr Asn Asp Thr Val Ser Glu
370 375 380 



Ser Ser Phe Tyr Ser Tyr Gly Glu Ile Ser Phe Gly Val Thr Asp Gly
385 390 395 400



Pro Arg Tyr Cys Phe Ala Leu Tyr Asn Gly Thr Ala Leu Lys Tyr Leu 
405 410 415



Gly Thr Leu Pro Pro Ser Val Lys Glu Ile Ala Ile Ser Lys Trp Gly
420 425 430 



His Phe Tyr Ile Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro Ile Asp
435 440 445 



Cys Ile Ser Phe Asn Leu Thr Thr Gly Asp Ser Gly Ala Phe Trp Thr
450 455 460



Ile Ala Tyr Thr Ser Tyr Thr Asp Ala Leu Val Gln Val Glu Asn Thr
465 470 475 480

Ala Ile Lys Lys Val Thr Tyr Cys Asn Ser His Ile Asn Asn Ile Lys
485 490 495



Cys Ser Gln Leu Thr Ala Asn Leu Gln Asn Gly Phe Tyr Pro Val Ala
500 505 510 



Ser Ser Glu Val Gly Leu Val Asn Lys Ser Val Val Leu Leu Pro Ser
515 520 525



Phe Tyr Ser His Thr Ser Val Asn Ile Thr Ile Asp Leu Gly Met Lys
530 535 540



Arg Ser Gly Tyr Gly Gln Pro Ile Ala Ser Thr Leu Ser Asn Ile Thr
545 550 555 560 



Leu Pro Met Gln Asp Asn Asn Thr Asp Val Tyr Cys Ile Arg Ser Asn
565 570 575 



Arg Phe Ser Val Tyr Phe His Ser Thr Cys Lys Ser Ser Leu Trp Asp
580 585 590 



Asp Val Phe Asn Ser Asp Cys Thr Asp Val Leu Tyr Ala Thr Ala Val 
595 600 605 



Ile Lys Thr Gly Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn Asn Tyr
610 615 620 



Leu Thr Phe Asn Lys Phe Cys Leu Ser Leu Asn Pro Val Gly Ala Asn 
625 630 635 640 
Cys Lys Phe Asp Val Ala Ala Arg Thr Arg Thr Asn Glu Gln Val Val
645 650 655



Arg Ser Leu Tyr Val Ile Tyr Glu Glu Gly Asp Asn Ile Val Gly Val
660 665 670 



Pro Ser Asp Asn Ser Gly Leu His Asp Leu Ser Val Leu His Leu Asp 
675 680 685



Ser Cys Thr Asp Tyr Asn Ile Tyr Gly Ile Thr Gly Val Gly Ile Ile
690 695 700 



Arg Gln Thr Asn Ser Thr Leu Leu Ser Gly Leu Tyr Tyr Thr Ser Leu
705 710 715 720 



Ser Gly Asp Leu Leu Gly Phe Lys Asn Val Ser Asp Gly Val Ile Tyr
725 730 735



Ser Val Thr Pro Cys Asp Val Ser Ala His Ala Ala Val Ile Asp Gly
740 745 750 



Ala Ile Val Gly Ala Met Thr Ser Ile Asn Ser Glu Leu Leu Gly Leu
755 760 765



Thr His Trp Thr Thr Thr Pro Asn Phe Tyr Tyr Tyr Ser Ile Tyr Asn
770 775 780 



Tyr Thr Asn Glu Arg Thr Arg Gly Thr Ala Ile Asp Ser Asn Asp Val
785 790 795 800 



Asp Cys Glu Pro Ile Ile Thr Tyr Ser Asn Ile Gly Val Cys Lys Asn
805 810 815 



Gly Ala Leu Val Phe Ile Asn Val Thr His Ser Asp Gly Asp Val Gln
820 825 830



Pro Ile Ser Thr Gly Asn Val Thr Ile Pro Thr Asn Phe Thr Ile Ser
835 840 845



Val Gln Val Glu Tyr Ile Gln Val Tyr Thr Thr Pro Val Ser Ile Asp
850 855 860 



Cys Ser Arg Tyr Val Cys Asn Gly Asn Pro Arg Cys Asn Lys Leu Leu 
865 870 875 880 
Thr Gln Tyr Val Ser Ala Cys Gln Thr Ile Glu Gln Ala Leu Ala Met
885 890 895



Gly Ala Arg Leu Glu Asn Met Glu Ile Asp Ser Met Leu Phe Val Ser
900 . 905 910 



Glu Asn Ala Leu Lys Leu Ala Ser Val Glu Ala Phe Asn Ser Thr Glu 
915 920 925 

Thr Leu Asp Pro Ile Tyr Lys Glu Trp Pro Asn Ile Gly Gly Ser Trp
930 935 940 

Leu Gly Gly Leu Lys Asp Ile Leu Pro Ser His Asn Ser Lys Arg Lys
945 950 955 960 

Tyr Arg Ser Ala Ile Glu Asp Leu Leu Phe Asp Lys Val Val Thr Ser
965 970 975 

Gly Leu Gly Thr Val Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr 
980 985 990 

Asp Ile Ala Asp Leu Val Cys Ala Gln Tyr Tyr Asn Gly Ile Met Val
995 1000 1005

Leu Pro Gly Val Ala Asn Asp Asp Lys Met Ala Met Tyr Thr Ala 
1010 1015 1020

Ser Leu Ala Gly Gly Ile Thr Leu Gly Ser Leu Gly Gly Gly Ala
1025 1030 1035

Val Ser Ile Pro Phe Ala Ile Ala Val Gln Ala Arg Leu Asn Tyr
1040 1045 1050



Val Ala Leu Gln Thr Asp Val Leu Asn Lys Asn Gln Gln Ile Leu
1055 1060 1065



Ala Asn Ala Phe Asn Gin Ala He Gly Asn He Thr Gin Ala Phe 
1070 1075 1080 



Gly Lys Val Asn Asp Ala lie His Gin Thr Ser Gin Gly Leu Ala 
1085 1090 1095 



Thr Val Ala Lys Val Leu Ala Lys Val Gin Asp Val Val Asn Thr 
1100 1105 1110 
Gln Gly Gln Ala Leu Ser His Leu Thr Leu Gln Leu Gln Asn Asn
1115 1120 1125

Phe Gln Ala Ile Ser Ser Ser Ile Ser Asp Ile Tyr Asn Arg Leu
1130 1135 1140



Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile Thr Gly
1145 1150 1155 



Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gin Thr Leu Thr Arg 
1160 1165 1170 



Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val
1175 1180 1185



Asn Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly
1190 1195 1200



Asn Gly Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly 
1205 1210 1215 



Met He Phe Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr 
1220 1225 1230 



Val Thr Ala Trp Ser Gly He Cys Ala Ser Asp Gly Asp Arg Thr 
1235 1240 1245



Phe Gly Leu Val Val Lys Asp Val Gin Leu Thr Leu Phe Arg Asn 
1250 1255 1260



Leu Asp Asp Lys Phe Tyr Leu Thr Pro Arg Thr Met Tyr Gin Pro 
1265 1270 1275 



He Val Ala Thr Ser Ser Asp Phe Val Gin He Glu Gly Cys Asp 
1280 1285 1290 



Val Leu Phe Val Asn Ala Thr Val Ile Asp Leu Pro Ser Ile Ile
1295 1300 1305



Pro Asp Tyr He Asp He Asn Gin Thr Val Gin Asp He Leu Glu 
1310 1315 1320 



Asn Phe Arg Pro Asn Trp Thr Val Pro Glu Leu Pro Leu Asp He 
1325 1330 1335

Phe Asn Ala Thr Tyr Leu Asn Leu Thr Gly Glu He Asn Asp Leu 
1340 1345 1350



Glu Phe Arg Ser Glu Lys Leu His Asn Thr Thr Val Glu Leu Ala
1355 1360 1365 



lie Leu lie Asp Asn lie Asn Asn Thr Leu Val Asn Leu Glu Trp 
1370 1375 1380 



Leu Asn Arg lie Glu Thr Tyr Val Lys Trp Pro Trp Tyr Val Trp 
1385 1390 1395 



Leu Leu Ile Gly Leu Val Val Ile Phe Cys Ile Pro Ile Leu Leu
1400 1405 1410



Phe Cys Cys Cys Ser Thr Gly Cys Cys Gly Cys He Gly Cys Leu 
1415 1420 1425 



Gly Ser Cys Cys His Ser Ile Cys Ser Arg Arg Gln Phe Glu Ser
1430 1435 1440 



Tyr Glu Pro He Glu Lys Val His Val His 
1445 1450 



<210> 56 

<211> 1464 

<212> PRT 

<213> Feline infectious peritonitis virus 

<400> 56 

Met Ile Phe Ile Ile Leu Thr Leu Leu Ser Val Ala Lys Ser Glu Asp
1 5 10 15

Ala Pro His Gly Val Thr Leu Pro Gln Phe Asn Thr Ser His Asn Asn
20 25 30 



Glu Arg Phe Glu Leu Asn Phe Tyr Asn Phe Leu Gln Thr Trp Asp Ile
35 40 45



Pro Pro Asn Thr Glu Thr Ile Leu Gly Gly Tyr Leu Pro Tyr Cys Gly
50 55 60



Ala Gly Val Asn Cys Gly Trp Tyr Asn Phe Ser Gin Ser Val Gly Gin 
65 70 75 80 



Asn Gly Lys Tyr Ala Tyr Ile Asn Thr Gln Asn Leu Asn Ile Pro Asn
85 90 95

Val His Gly Val Tyr Phe Asp Val Arg Glu His Asn Asn Asp Gly Glu
100 105 110



Trp Asp Asp Arg Asp Lys Val Gly Leu Leu Ile Ala Ile His Gly Asn
115 120 125



Ser Lys Tyr Ser Leu Leu Met Val Leu Gin Asp Ala Val Glu Ala Asn 
130 135 140 



Gin Pro His Val Ala Val Lys lie Cys His Trp Lys Pro Gly Asn lie 
145 150 155 160 



Ser Ser Tyr His Ala Phe Ser Val Asn Leu Gly Asp Gly Gly Gln Cys
165 170 175

Val Phe Asn Gln Arg Phe Ser Leu Asp Thr Val Leu Thr Thr Asn Asp
180 185 190



Phe Tyr Gly Phe Gln Trp Thr Asp Thr Tyr Val Asp Ile Tyr Leu Gly
195 200 205



Gly Thr Ile Thr Lys Val Trp Val Asp Asn Asp Trp Ser Ile Val Glu
210 215 220



Ala Ser lie Ser Tyr His Trp Asn Arg He Asn Tyr Gly Tyr Tyr Met 
225 230 235 240 



Gln Phe Val Asn Arg Thr Thr Tyr Tyr Ala Tyr Asn Asn Thr Gly Gly
245 250 255



Ala Asn Tyr Thr Gin Leu Gin Leu Ser Glu Cys His Thr Asp Tyr Cys 
260 265 270 



Ala Gly Tyr Ala Lys Asn Val Phe Val Pro Ile Asp Gly Lys Ile Pro
275 280 285



Glu Asp Phe Ser Phe Ser Asn Trp Phe Leu Leu Ser Asp Lys Ser Thr 
290 295 300 



Leu Val Gin Gly Arg Val Leu Ser Ser Gin Pro Val Phe Val Gin Cys 
305 310 315 320 



Leu Arg Pro Val Pro Ser Trp Ser Asn Asn Thr Ala Val Val His Phe 
325 330 335

Lys Asn Asp Ala Phe Cys Pro Asn Val Thr Ala Asp Val Leu Arg Phe
340 345 350



Asn Leu Asn Phe Ser Asp Thr Asp Val Tyr Thr Asp Ser Thr Asn Asp
355 360 365 



Glu Gln Leu Phe Phe Thr Phe Glu Asp Asn Thr Thr Ala Ser Ile Ala
370 375 380 



Cys Tyr Ser Ser Ala Asn Val Thr Asp Phe Gln Pro Ala Asn Asn Ser
385 390 395 400



Val Ser His Ile Pro Phe Gly Lys Thr Ala His Phe Cys Phe Ala Asn
405 410 415 



Phe Ser His Ser Ile Val Ser Arg Gln Phe Leu Gly Ile Leu Pro Pro
420 425 430



Thr Val Arg Glu Phe Ala Phe Gly Arg Asp Gly Ser Ile Phe Val Asn
435 440 445 



Gly Tyr Lys Tyr Phe Ser Leu Pro Ala Ile Arg Ser Val Asn Phe Ser
450 455 460

Ile Ser Ser Val Glu Glu Tyr Gly Phe Trp Thr Ile Ala Tyr Thr Asn
465 470 475 480



Tyr Thr Asp Val Met Val Asp Val Asn Gly Thr Ala Ile Thr Arg Leu
485 490 495 



Phe Tyr Cys Asp Ser Pro Leu Asn Arg lie Lys Cys Gin Gin Leu Lys 
500 505 510 



His Glu Leu Pro Asp Gly Phe Tyr Ser Ala Ser Met Leu Val Lys Lys 
515 520 525 



Asp Leu Pro Lys Thr Phe Val Thr Met Pro Gin Phe Tyr His Trp Met 
530 535 540 



Asn Val Thr Leu His Val Val Leu Asn Asp Thr Glu Lys Lys Tyr Asp 
545 550 555 560



Ile Ile Leu Ala Lys Ala Pro Glu Leu Ala Ala Leu Ala Asp Val His
565 570 575 



Phe Glu Ile Ala Gln Ala Asn Gly Ser Val Thr Asn Val Thr Ser Leu
580 585 590



Cys Val Gln Ala Arg Gln Leu Ala Leu Phe Tyr Lys Tyr Thr Ser Leu
595 600 605



Gln Gly Leu Tyr Thr Tyr Ser Asn Leu Val Glu Leu Gln Asn Tyr Asp
610 615 620



Cys Pro Phe Ser Pro Gln Gln Phe Asn Asn Tyr Leu Gln Phe Glu Thr
625 630 635 640



Leu Cys Phe Asp Val Asn Pro Ala Val Ala Gly Cys Lys Trp Ser Leu 
645 650 655 



Val His Asp Val Gln Trp Arg Thr Gln Phe Ala Thr Ile Thr Val Ser
660 665 670



Tyr Lys His Gly Ser Met lie Thr Thr His Ala Lys Gly His Ser Trp 
675 680 685 



Gly Phe Gln Asp Thr Ser Val Leu Val Lys Asp Glu Cys Thr Asp Tyr
690 695 700 



Asn Ile Tyr Gly Phe Gln Gly Thr Gly Ile Ile Arg Asn Thr Thr Ser
705 710 715 720



Arg Leu Val Ala Gly Leu Tyr Tyr Thr Ser He Ser Gly Asp Leu Leu 
725 730 735 



Ala Phe Lys Asn Ser Thr Thr Gly Glu He Phe Thr Val Val Pro Cys 
740 745 750 



Asp Leu Thr Ala Gln Val Ala Val Ile Asn Asp Glu Ile Val Gly Ala
755 760 765 



He Thr Ala Val Asn Gin Thr Asp Leu Phe Glu Phe Val Asn Asn Thr 
770 775 780 



Gin Ala Arg Arg Ser Arg Ser Ser Thr Pro Asn Phe Val Thr Ser Tyr 
785 790 795 800 



Thr Met Pro Gin Phe Tyr Tyr He Thr Lys Trp Asn Asn Asp Thr Ser 
805 810 815 



Ser Asn Cys Thr Ser Ala He Thr Tyr Ser Ser Phe Ala He Cys Asn 
820 825 830

Thr Gly Glu Ile Lys Tyr Val Asn Val Thr His Val Glu Ile Val Asp
835 840 845



Asp Ser Ile Gly Val Ile Lys Pro Val Ser Thr Gly Asn Ile Ser Ile
850 855 860



Pro Lys Asn Phe Thr Val Ala Val Gln Ala Glu Tyr Ile Gln Ile Gln
865 870 875 880



Val Lys Pro Val Val Val Asp Cys Ala Thr Tyr Val Cys Asn Gly Asn
885 890 895 



Thr His Cys Leu Lys Leu Leu Thr Gln Tyr Thr Ser Ala Cys Gln Thr
900 905 910 



Ile Glu Asn Ala Leu Asn Leu Gly Ala Arg Leu Glu Ser Leu Met Leu
915 920 925



Asn Asp Met He Thr Val Ser Asp Arg Gly Leu Glu Leu Ala Thr Val 
930 935 940 



Glu Arg Phe Asn Ala Thr Ala Leu Gly Gly Glu Lys Leu Gly Gly Leu 
945 950 955 960



Tyr Phe Asp Gly Leu Ser Ser Leu Leu Pro Pro Lys Ile Gly Lys Arg
965 970 975



Ser Ala Val Glu Asp Leu Leu Phe Asn Lys Val Val Thr Ser Gly Leu
980 985 990 

Gly Thr Val Asp Asp Asp Tyr Lys Lys Cys Ser Ser Gly Thr Asp Val
995 1000 1005 



Ala Asp Leu Val Cys Ala Gln Tyr Tyr Asn Gly Ile Met Val Leu
1010 1015 1020 



Pro Gly Val Val Asp Gly Asn Lys Met Ser Met Tyr Thr Ala Ser 
1025 1030 1035



Leu He Gly Gly Met Ala Leu Gly Ser He Thr Ser Ala Val Ala 
1040 1045 1050 



Val Pro Phe Ala Met Gin Val Gin Ala Arg Leu Asn Tyr Val Ala 
1055 1060 1065

Leu Gln Thr Asp Val Leu Gln Glu Asn Gln Lys Ile Leu Ala Asn
1070 1075 1080



Ala Phe Asn Asn Ala Ile Gly Asn Ile Thr Leu Ala Leu Gly Lys
1085 1090 1095



Val Ser Asn Ala Ile Thr Thr Thr Ser Asp Gly Phe Asn Ser Met
1100 1105 1110 



Ala Ser Ala Leu Thr Lys Ile Gln Ser Val Val Asn Gln Gln Gly
1115 1120 1125



Glu Ala Leu Ser Gin Leu Thr Ser Gin Leu Gin Lys Asn Phe Gin 
1130 1135 1140 



Ala He Ser Ser Ser He Ala Glu He Tyr Asn Arg Leu Glu Lys 
1145 1150 1155 



Val Glu Ala Asp Ala Gln Val Asp Arg Leu Ile Thr Gly Arg Leu
1160 1165 1170 



Ala Ala Leu Asn Ala Tyr Val Ser Gln Thr Leu Thr Gln Tyr Ala
1175 1180 1185



Glu Val Lys Ala Ser Arg Gln Ile Ala Leu Glu Lys Val Asn Glu
1190 1195 1200



Cys Val Lys Ser Gln Ser Asn Arg Tyr Gly Phe Cys Gly Asn Gly
1205 1210 1215



Thr His Leu Phe Ser Leu Val Asn Ser Ala Pro Glu Gly Leu Leu
1220 1225 1230



Phe Phe His Thr Val Leu Leu Pro Thr Glu Trp Glu Glu Val Thr
1235 1240 1245



Ala Trp Ser Gly Ile Cys Val Asn Asp Thr Tyr Ala Tyr Val Leu
1250 1255 1260 



Lys Asp Phe Asp His Ser He Phe Ser Tyr Asn Gly Thr Tyr Met 
1265 1270 1275 



Val Thr Pro Arg Asn Met Phe Gin Pro Arg Lys Pro Gin Met Ser 
1280 1285 1290

Asp Phe Val Gln Ile Thr Ser Cys Glu Val Thr Phe Leu Asn Met
1295 1300 1305



Thr Tyr Thr Thr Phe Gln Glu Ile Val Ile Asp Tyr Ile Asp Ile
1310 1315 1320



Asn Lys Thr Ile Ala Asp Met Leu Glu Gln Tyr Asn Pro Asn Tyr
1325 1330 1335



Thr Thr Pro Glu Leu Asn Leu Leu Leu Asp He Phe Asn Gin Thr 
1340 1345 1350 



Lys Leu Asn Leu Thr Ala Glu Ile Asp Gln Leu Glu Gln Arg Ala
1355 1360 1365



Asp Asn Leu Thr Thr Ile Ala His Glu Leu Gln Gln Tyr Ile Asp
1370 1375 1380



Asn Leu Asn Lys Thr Leu Val Asp Leu Asp Trp Leu Asn Arg Ile
1385 1390 1395



Glu Thr Tyr Val Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile Gly
1400 1405 1410 



Leu Val Val Val Phe Cys Ile Pro Leu Leu Leu Phe Cys Cys Leu
1415 1420 1425



Ser Thr Gly Phe Cys Gly Cys Phe Gly Cys Val Gly Ser Cys Cys 
1430 1435 1440 



His Ser Leu Cys Ser Arg Arg Gin Phe Glu Thr Tyr Glu Pro He 
1445 1450 1455 



Glu Lys Val His He His 
1460 



<210> 57 

<211> 1235

Met Leu Phe Val Phe Ile Leu Leu Leu Pro Ser Cys Leu Gly Tyr Ile
1 5 10 15 



Gly Asp Phe Arg Cys He Gin Thr Val Asn Tyr Asn Gly Asn Asn Ala 
20 25 30

Ser Ala Pro Ser Ile Ser Thr Glu Ala Val Asp Val Ser Lys Gly Arg
35 40 45



Gly Thr Tyr Tyr Val Leu Asp Arg Val Tyr Leu Asn Ala Thr Leu Leu
50 55 60 



Leu Thr Gly Tyr Tyr Pro Val Asp Gly Ser Asn Tyr Arg Asn Leu Ala 
65 70 75 80



Leu Thr Gly Thr Asn Thr Leu Ser Leu Thr Trp Phe Lys Pro Pro Phe
85 90 95



Leu Ser Glu Phe Asn Asp Gly lie Phe Ala Lys Val Gin Asn Leu Lys 
100 105 110 



Thr Asn Thr Pro Thr Gly Ala Thr Ser Tyr Phe Pro Thr Ile Val Ile
115 120 125



Gly Ser Leu Phe Gly Asn Thr Ser Tyr Thr Val Val Leu Glu Pro Tyr 
130 135 140



Asn Asn lie lie Met Ala Ser Val Cys Thr Tyr Thr He Cys Gin Leu 
145 150 155 160 



Pro Tyr Thr Pro Cys Lys Pro Asn Thr Asn Gly Asn Arg Val Ile Gly
165 170 175 



Phe Trp His Thr Asp Val Lys Pro Pro Ile Cys Leu Leu Lys Arg Asn
180 185 190 



Phe Thr Phe Asn Val Asn Ala Pro Trp Leu Tyr Phe His Phe Tyr Gln
195 200 205 



Gin Gly Gly Thr Phe Tyr Ala Tyr Tyr Ala Asp Lys Pro Ser Ala Thr 
210 215 220 



Thr Phe Leu Phe Ser Val Tyr Ile Gly Asp Ile Leu Thr Gln Tyr Phe
225 230 235 240



Val Leu Pro Phe He Cys Thr Pro Thr Ala Gly Ser Thr Leu Ala Pro 
245 250 255 



Leu Tyr Trp Val Thr Pro Leu Leu Lys Arg Gin Tyr Leu Phe Asn Phe 
260 265 270

Asn Glu Lys Gly Val Ile Thr Ser Ala Val Asp Cys Ala Ser Ser Tyr
275 280 285



Ile Ser Glu Ile Lys Cys Lys Thr Gln Ser Leu Leu Pro Ser Thr Gly
290 295 300 



Val Tyr Asp Leu Ser Gly Tyr Thr Val Gln Pro Val Gly Val Val Tyr
305 310 315 320



Arg Arg Val Pro Asn Leu Pro Asp Cys Lys Ile Glu Glu Trp Leu Thr
325 330 335



Ala Lys Ser Val Pro Ser Pro Leu Asn Trp Glu Arg Arg Thr Phe Gin 
340 345 350 



Asn Cys Asn Phe Asn Leu Ser Ser Leu Leu Arg Tyr Val Gln Ala Glu
355 360 365 



Ser Leu Ser Cys Asn Asn lie Asp Ala Ser Lys Val Tyr Gly Met Cys 
370 375 380 



Phe Gly Ser Val Ser Val Asp Lys Phe Ala Ile Pro Arg Ser Arg Gln
385 390 395 400



lie Asp Leu Gin lie Gly Asn Ser Gly Phe Leu Gin Thr Ala Asn Tyr 
405 410 415 



Lys lie Asp Thr Ala Ala Thr Ser Cys Gin Leu Tyr Tyr Ser Leu Pro 
420 425 430 



Lys Asn Asn Val Thr He Asn Asn Tyr Asn Pro Ser Ser Trp Asn Arg 
435 440 445 



Arg Tyr Gly Phe Lys Val Asn Asp Arg Cys Gin He Phe Ala Asn He 
450 455 460 



Leu Leu Asn Gly He Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gin 
465 470 475 480 



Leu Pro Asn Thr Glu Val Ala Thr Gly Val Cys Val Arg Tyr Asp Leu 
485 490 495 



Tyr Gly He Thr Gly Gin Gly Val Phe Lys Glu Val Lys Ala Asp Tyr 
500 505 510

Tyr Asn Ser Trp Gln Ala Leu Leu Tyr Asp Val Asn Gly Asn Leu Asn
515 520 525



Gly Phe Arg Asp Leu Thr Thr Asn Lys Thr Tyr Thr Ile Arg Ser Cys
530 535 540 



Tyr Ser Gly Arg Val Ser Ala Ala Tyr His Lys Glu Ala Pro Glu Pro
545 550 555 560



Ala Leu Leu Tyr Arg Asn Ile Asn Cys Ser Tyr Val Phe Thr Asn Asn
565 570 575 



lie Ser Arg Glu Glu Asn Pro Leu Asn Tyr Phe Asp Ser Tyr Leu Gly 
580 585 590 



Cys Val Val Asn Ala Asp Asn Arg Thr Asp Glu Ala Leu Pro Asn Cys 
595 600 605 



Asn Leu Arg Met Gly Ala Gly Leu Cys Val Asp Tyr Ser Lys Ser Arg 
610 615 620 



Arg Ala Arg Arg Ser Val Ser Thr Gly Tyr Arg Leu Thr Thr Phe Glu
625 630 635 640

Pro Tyr Met Pro Met Leu Val Asn Asp Ser Val Gln Ser Val Gly Gly
645 650 655

Leu Tyr Glu Met Gln Ile Pro Thr Asn Phe Thr Ile Gly His His Glu
660 665 670

Glu Phe Ile Gln Ile Arg Ala Pro Lys Val Thr Ile Asp Cys Ala Ala
675 680 685



Phe Val Cys Gly Asp Asn Ala Ala Cys Arg Gln Gln Leu Val Glu Tyr
690 695 700 



Gly Ser Phe Cys Asp Asn Val Asn Ala Ile Leu Asn Glu Val Asn Asn
705 710 715 720



Leu Leu Asp Asn Met Gin Leu Gin Val Ala Ser Ala Leu Met Gin Gly 
725 730 735 



Val Thr lie Ser Ser Arg Leu Pro Asp Gly lie Ser Gly Pro lie Asp 
740 745 750 



Asp Ile Asn Phe Ser Pro Leu Leu Gly Cys Ile Gly Ser Thr Cys Ala
755 760 765



Glu Asp Gly Asn Gly Pro Ser Ala Ile Arg Gly Arg Ser Ala Ile Glu
770 775 780 



Asp Leu Leu Phe Asp Lys Val Lys Leu Ser Asp Val Gly Phe Val Glu
785 790 795 800



Ala Tyr Asn Asn Cys Thr Gly Gly Gin Glu Val Arg Asp Leu Leu Cys 
805 810 815 



Val Gln Ser Phe Asn Gly Ile Lys Val Leu Pro Pro Val Leu Ser Glu
820 825 830



Ser Gln Ile Ser Gly Tyr Thr Ala Gly Ala Thr Ala Ala Ala Met Phe
835 840 845



Pro Pro Trp Thr Ala Ala Ala Gly Val Pro Phe Ser Leu Asn Val Gin 
850 855 860 



Tyr Arg Ile Asn Gly Leu Gly Val Thr Met Asn Val Leu Ser Glu Asn
865 870 875 880 



Gln Lys Met Ile Ala Ser Ala Phe Asn Asn Ala Leu Gly Ala Ile Gln
885 890 895



Glu Gly Phe Asp Ala Thr Asn Ser Ala Leu Gly Lys Ile Gln Ser Val
900 905 910



Val Asn Ala Asn Ala Glu Ala Leu Asn Asn Leu Leu Asn Gin Leu Ser 
915 920 925 



Asn Arg Phe Gly Ala Ile Ser Ala Ser Leu Gln Glu Ile Leu Thr Arg
930 935 940



Leu Asp Ala Val Glu Ala Lys Ala Gin lie Asp Arg Leu lie Asn Gly 
945 950 955 960 



Arg Leu Thr Ala Leu Asn Ala Tyr Ile Ser Lys Gln Leu Ser Asp Ser
965 970 975 



Thr Leu lie Lys Phe Ser Ala Ala Gin Ala Ile Glu Lys Val Asn Glu 
980 985 990 

Cys Val Lys Ser Gin Thr Thr Arg lie Asn Phe Cys Gly Asn Gly Asn 
995 1000 1005

His Ile Leu Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu Cys Phe
1010 1015 1020



Ile His Phe Ser Tyr Val Pro Thr Ser Phe Lys Thr Ala Asn Val
1025 1030 1035



Ser Pro Gly Leu Cys He Ser Gly Asp Arg Gly Leu Ala Pro Lys 
1040 1045 1050 



Ala Gly Tyr Phe Val Gln Asp Asn Gly Glu Trp Lys Phe Thr Gly
1055 1060 1065 



Ser Asn Tyr Tyr Tyr Pro Glu Pro Ile Thr Asp Lys Asn Ser Val
1070 1075 1080



Ala Met Ile Ser Cys Ala Val Asn Tyr Thr Lys Ala Pro Glu Val
1085 1090 1095 



Phe Leu Asn Asn Ser Ile Pro Asn Leu Pro Asp Phe Lys Glu Glu
1100 1105 1110



Leu Asp Lys Trp Phe Lys Asn Gln Thr Ser Ile Ala Pro Asp Leu
1115 1120 1125



Ser Leu Asp Phe Glu Lys Leu Asn Val Thr Phe Leu Asp Leu Thr
1130 1135 1140



Tyr Glu Met Asn Arg Ile Gln Asp Ala Ile Lys Lys Leu Asn Glu
1145 1150 1155



Ser Tyr He Asn Leu Lys Glu Val Gly Thr Tyr Glu Met Tyr Val 
1160 1165 1170 



Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile Gly Leu Ala Gly Val
1175 1180 1185



Ala Val Cys Val Leu Leu Phe Phe He Cys Cys Cys Thr Gly Cys 
1190 1195 1200 



Gly Ser Cys Cys Phe Arg Lys Cys Gly Ser Cys Cys Asp Glu Tyr 
1205 1210 1215



Gly Gly His Gln Asp Ser Ile Val Ile His Asn Ile Ser Ala His
1220 1225 1230

Glu Asp
1235



<210> 58 

<211> 1363 

<212> PRT 

<213> human coronavirus

<400> 58 



Met Phe Leu Ile Leu Leu Ile Ser Leu Pro Met Ala Leu Ala Val Ile
1 5 10 15



Gly Asp Leu Lys Cys Thr Thr Val Ala lie Asn Asp Val Asp Thr Gly 
20 25 30 



Val Pro Ser Thr Ser Thr Asp lie Val Asp Val Thr Asn Gly Leu Gly 
35 40 45 



Thr Tyr Tyr Val Leu Asp Arg Val Tyr Leu Asn Thr Thr Leu Leu Leu
50 55 60 



Asn Gly Tyr Tyr Pro Thr Ser Gly Ser Thr Tyr Arg Asn Met Ala Leu
65 70 75 80



Lys Gly Thr Leu Leu Leu Ser Arg Leu Trp Phe Lys Pro Pro Phe Leu 
85 90 95



Ser Asp Phe Ile Asn Gly Ile Phe Ala Lys Val Lys Asn Thr Lys Val
100 105 110



Ile Lys His Gly Val Met Tyr Ser Glu Phe Pro Ala Ile Thr Ile Gly
115 120 125



Ser Thr Phe Val Asn Thr Ser Tyr Ser Val Val Val Gin Pro His Thr 
130 135 140 



Thr Asn Leu Asp Asn Lys Leu Gin Gly Leu Leu Glu lie Ser Val Cys 
145 150 155 160 



Gin Tyr Thr Met Cys Glu Tyr Pro Asn Thr He Cys His Pro Asn Leu 
165 170 175 



Gly Asn Arg Arg Val Glu Leu Trp His Trp Asp Thr Gly Val Val Ser
180 185 190 



Cys Leu Tyr Lys Arg Asn Phe Thr Tyr Asp Val Asn Ala Asp Tyr Leu 



195 200 205



Tyr Phe His Phe Tyr Gln Glu Gly Gly Ile Phe Tyr Ala Tyr Phe Thr
210 215 220 



Asp Thr Gly Val Val Thr Lys Phe Leu Phe Asn Val Tyr Leu Gly Thr
225 230 235 240



Val Leu Ser Tyr Tyr Tyr Val Met Pro Leu Thr Cys Asn Ser Ala Met 
245 250 255 



Thr Leu Glu Tyr Trp Val Thr Pro Leu Thr Ser Lys Gin Tyr Leu Leu 
260 265 270 



Ala Phe Asn Gin Asp Gly Val He Phe Asn Ala Val Asp Cys Lys Ser 
275 280 285 



Asp Phe Met Ser Glu He Lys Cys Lys Thr Leu Ser He Ala Pro Ser 
290 295 300 



Thr Gly Val Tyr Glu Leu Asn Gly Tyr Thr Val Gln Pro Ile Ala Asp
305 310 315 320



Val Tyr Arg Arg Ile Pro Asn Leu Pro Asp Cys Asn Ile Glu Ala Trp
325 330 335



Leu Asn Asp Lys Ser Val Pro Ser Pro Leu Asn Trp Glu Arg Lys Thr 
340 345 350 



Phe Ser Asn Cys Asn Phe Asn Met Ser Ser Leu Met Ser Phe He Gin 
355 360 365 



Ala Asp Ser Phe Thr Cys Asn Asn Ile Asp Ala Ala Lys Ile Tyr Gly
370 375 380



Met Cys Phe Ser Ser lie Thr He Asp Lys Phe Ala He Pro Asn Gly 
385 390 395 400 



Arg Lys Val Asp Leu Gin Leu Gly Asn Leu Gly Tyr Leu Gin Ser Phe 
405 410 415 



Asn Tyr Arg He Asp Thr Thr Ala Thr Ser Cys Gin Leu Tyr Tyr Asn 
420 425 430 



Leu Pro Ala Ala Asn Val Ser Val Ser Arg Phe Asn Pro Ser He Trp 
435 440 445

Asn Arg Arg Phe Gly Phe Thr Glu Gln Ser Val Phe Lys Pro Gln Pro
450 455 460



Ala Gly Val Phe Thr Asp His Asp Val Val Tyr Ala Gln His Cys Phe
465 470 475 480



Lys Ala Pro Thr Asn Phe Cys Pro Cys Lys Leu Asp Gly Ser Leu Cys 
485 490 495



Val Gly Asn Gly Pro Gly Ile Asp Ala Gly Tyr Lys Asn Ser Gly Ile
500 505 510 



Gly Thr Cys Pro Ala Gly Thr Asn Tyr Leu Thr Cys His Asn Ala Val 
515 520 525 



Gln Cys Asn Cys Leu Cys Thr Pro Asp Pro Ile Thr Ser Lys Ser Thr
530 535 540 



Gly Pro Tyr Lys Cys Pro Gin Thr Lys Tyr Leu Val Gly He Gly Glu 
545 550 555 560 



His Cys Ser Gly Leu Ala Ile Lys Ser Asp Tyr Cys Gly Gly Asn Pro
565 570 575



Cys Thr Cys Gin Pro Gin Ala Phe Leu Gly Trp Ser Val Asp Ser Cys 
580 585 590 



Leu Gln Gly Asp Arg Cys Asn Ile Phe Ala Asn Phe Ile Leu His Asp
595 600 605 



Val Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gin Lys Ser Asn Thr 
610 615 620 



Asp Ile Ile Leu Gly Val Cys Val Asn Tyr Asp Leu Tyr Gly Ile Thr
625 630 635 640



Gly Gln Gly Ile Phe Val Glu Val Asn Ala Pro Tyr Tyr Asn Ser Trp
645 650 655 



Gin Asn Leu Leu Tyr Asp Ser Asn Gly Asn Leu Tyr Gly Phe Arg Asp 
660 665 670 



Tyr Leu Thr Asn Arg Thr Phe Met Ile Arg Ser Cys Tyr Ser Gly Arg
675 680 685

Val Ser Ala Ala Phe His Ala Asn Ser Ser Glu Pro Ala Leu Leu Phe
690 695 700



Arg Asn Ile Lys Cys Asn Tyr Val Phe Asn Asn Thr Leu Ser Arg Gln
705 710 715 720



Leu Gln Pro Ile Asn Tyr Phe Asp Ser Tyr Leu Gly Cys Val Val Asn
725 730 735



Ala Asp Asn Ser Thr Ala Ser Ala Val Gln Thr Cys Asp Leu Thr Val
740 745 750 



Gly Ser Gly Tyr Cys Val Asp Tyr Ser Thr Lys Arg Arg Ser Arg Arg
755 760 765 



Ala Ile Thr Thr Gly Tyr Arg Phe Thr Asn Phe Glu Pro Phe Thr Val
770 775 780



Asn Ser Val Asn Asp Ser Leu Glu His Val Gly Gly Leu Tyr Glu Ile
785 790 795 800



Gln Ile Pro Ser Glu Phe Thr Ile Gly Asn Met Glu Glu Phe Ile Gln
805 810 815



Thr Ser Ser Pro Lys Val Thr Ile Asp Cys Ser Ala Phe Val Cys Gly
820 825 830



Asp Cys Ala Ala Cys Lys Ser Gln Leu Val Glu Tyr Gly Ser Phe Cys
835 840 845



Asp Asn He Asn Ala He Leu Thr Glu Val Asn Glu Leu Leu Asp Thr 
850 855 860 



Thr Gln Leu Gln Val Ala Asn Ser Leu Met Asn Gly Val Thr Leu Ser
865 870 875 880



Thr Lys Leu Lys Asp Gly Val Asn Phe Asn Val Asp Asp Val Asn Phe 
885 890 895



Ser Pro Val Leu Gly Cys Leu Gly Ser Glu Cys Asn Lys Val Ser Ser 
900 905 910 



Arg Ser Ala He Glu Asp Leu Leu Phe Ser Lys Val Arg Leu Ser Asp 
915 920 925

Val Gly Phe Val Glu Ala Tyr Asn Asn Cys Thr Gly Gly Ala Gly Ile
930 935 940

Arg Asp Leu Ile Cys Val Gln Ser Tyr Asn Gly Ile Lys Val Leu Pro
945 950 955 960

Pro Leu Leu Ser Asp Asn Gln Ile Ser Gly Tyr Thr Leu Ala Ala Thr
965 970 975

Ser Ala Asn Leu Phe Pro Pro Trp Ser Ala Ala Ala Gly Val Pro Phe
980 985 990

Tyr Leu Asn Val Gln Tyr Arg Ile Asn Gly Ile Gly Val Thr Met Asp
995 1000 1005

Val Leu Ser Gln Asn Gln Lys Leu Ile Ala Asn Ala Phe Asn Asn
1010 1015 1020

Ala Leu Asp Ala Ile Gln Glu Gly Phe Asp Ala Thr Asn Ser Ala
1025 1030 1035

Leu Val Lys Ile Gln Ala Val Val Asn Ala Asp Ala Glu Ala Leu
1040 1045 1050

Asn Asn Leu Leu Gln Gln Leu Ser Asn Arg Phe Gly Ala Ile Ser
1055 1060 1065

Ser Ser Leu Gln Glu Ile Leu Ser Arg Leu Asp Ala Leu Glu Ala
1070 1075 1080

Gln Ala Gln Ile Asp Arg Leu Ile Asn Gly Arg Leu Thr Ala Leu
1085 1090 1095

Asp Ala Tyr Val Ser Gln Gln Leu Ser Asp Ser Thr Leu Val Lys
1100 1105 1110

Phe Ser Ala Ala Gin Ala Met Glu Lys Val Asn Glu Cys Val Lys 
1115 1120 1125 



Ser Gln Ser Ser Arg Ile Asn Phe Cys Gly Asn Gly Asn His Ile
1130 1135 1140



He Ser Leu Val Gin Asn Ala Pro Tyr Gly Leu Tyr Phe He His 
1145 1150 1155 



Phe Ser Tyr Val Pro Thr Lys Tyr Val Thr Ala Lys Val Ser Pro
1160 1165 1170



Gly Leu Cys lie Ala Gly Asp Arg Gly lie Ala Pro Lys Ser Gly 
1175 1180 1185 



Tyr Phe Val Asn Val Asn Asn Thr Trp Met Phe Thr Gly Ser Arg 
1190 1195 1200 



Tyr Tyr Tyr Pro Glu Pro Ile Thr Gly Asp Asn Val Val Val Met
1205 1210 1215



Ser Thr Cys Ala Val Asn Tyr Thr Lys Ala Pro Asp Val Met Leu 
1220 1225 1230 



Asn lie Ser Thr Pro Asn Leu Pro Asp Phe Lys Glu Glu Leu Asp 
1235 1240 1245 



Gln Trp Phe Lys Asn Gln Thr Leu Val Ala Pro Asp Leu Ser Leu
1250 1255 1260



Asp Tyr Ile Asn Val Thr Phe Leu Asp Leu Gln Asp Glu Met Asn
1265 1270 1275 



Arg Leu Gln Glu Ala Ile Lys Val Leu Asn Gln Ser Tyr Ile Asn
1280 1285 1290



Leu Lys Asp Ile Gly Thr Tyr Glu Tyr Tyr Val Lys Trp Pro Trp
1295 1300 1305



Tyr Val Trp Leu Leu He Gly Phe Ala Gly Val Ala Met Leu Val 
1310 1315 1320 



Leu Leu Phe Phe Ile Cys Cys Cys Thr Gly Cys Gly Thr Ser Cys
1325 1330 1335



Phe Lys Lys Cys Gly Gly Cys Cys Asp Asp Tyr Thr Gly His Gin 
1340 1345 1350 



Glu Leu Val Ile Lys Thr Ser His Glu Gly
1355 1360



Met Arg Ser Leu Ile Tyr Phe Trp Leu Leu Leu Pro Val Leu Pro Thr
1 5 10 15



Leu Ser Leu Pro Gln Asp Val Thr Arg Cys Gln Ser Thr Thr Asn Phe
20 25 30



Arg Arg Phe Phe Ser Lys Phe Asn Val Gln Ala Pro Ala Val Val Val
35 40 45



Leu Gly Gly Tyr Leu Pro Ser Met Asn Ser Ser Ser Trp Tyr Cys Gly 
50 55 60 



Thr Gly Ile Glu Thr Ala Ser Gly Val His Gly Ile Phe Leu Ser Tyr
65 70 75 80 



He Asp Ser Gly Gin Gly Phe Glu He Gly He Ser Gin Glu Pro Phe 
85 90 95 



Asp Pro Ser Gly Tyr Gln Leu Tyr Leu His Lys Ala Thr Asn Gly Asn
100 105 110 



Thr Asn Ala Thr Ala Arg Leu Arg Ile Cys Gln Phe Pro Asp Asn Lys
115 120 125



Thr Leu Gly Pro Thr Val Asn Asp Val Thr Thr Gly Arg Asn Cys Leu 
130 135 140



Phe Asn Lys Ala Ile Pro Ala Tyr Met Arg Asp Gly Lys Asp Ile Val
145 150 155 160



Val Gly Ile Thr Trp Asp Asn Asp Arg Val Thr Val Phe Ala Asp Lys
165 170 175



Ile Tyr His Phe Tyr Leu Lys Asn Asp Trp Ser Arg Val Ala Thr Arg
180 185 190



Cys Tyr Asn Arg Arg Ser Cys Ala Met Gin Tyr Val Tyr Thr Pro Thr 
195 200 205 



Tyr Tyr Met Leu Asn Val Thr Ser Ala Gly Glu Asp Gly He Tyr Tyr 
210 215 220 



Glu Pro Cys Thr Ala Asn Cys Thr Gly Tyr Ala Ala Asn Val Phe Ala 
225 230 235 240

Thr Asp Ser Asn Gly His Ile Pro Glu Gly Phe Ser Phe Asn Asn Trp
245 250 255



Phe Leu Leu Ser Asn Asp Ser Thr Leu Leu His Gly Lys Val Val Ser
260 265 270



Asn Gln Pro Leu Leu Val Asn Cys Leu Leu Ala Ile Pro Lys Ile Tyr
275 280 285 



Gly Leu Gly Gln Phe Phe Ser Phe Asn His Thr Met Asp Gly Val Cys
290 295 300 



Asn Gly Ala Ala Val Asp Arg Ala Pro Glu Ala Leu Arg Phe Asn Ile
305 310 315 320 



Asn Asp Thr Ser Val He Leu Ala Glu Gly Ser He Val Leu His Thr 
325 330 335 



Ala Leu Gly Thr Asn Leu Ser Phe Val Cys Ser Asn Ser Ser Asp Pro
340 345 350



His Leu Ala Ile Phe Ala Ile Pro Leu Gly Ala Thr Glu Val Pro Tyr
355 360 365

Tyr Cys Phe Leu Lys Val Asp Thr Tyr Asn Ser Thr Val Tyr Lys Phe 
370 375 380 



Leu Ala Val Leu Pro Ser Thr Val Arg Glu Ile Val Ile Thr Lys Tyr
385 390 395 400



Gly Asp Val Tyr Val Asn Gly Phe Gly Tyr Leu His Leu Gly Leu Leu 
405 410 415 



Asp Ala Val Thr He Tyr Phe Thr Gly His Gly Thr Asp Asp Asp Val 
420 425 430 



Ser Gly Phe Trp Thr Ile Ala Ser Thr Asn Phe Val Asp Ala Leu Ile
435 440 445



Glu Val Gln Gly Thr Ser Ile Gln Arg Ile Leu Tyr Cys Asp Asp Pro
450 455 460



Val Ser Gln Leu Lys Cys Ser Gln Val Ala Phe Asp Leu Asp Asp Gly
465 470 475 480 



Phe Tyr Pro Ile Ser Ser Arg Asn Leu Leu Ser His Glu Gln Pro Ile
485 490 495



Ser Phe Val Thr Leu Pro Ser Phe Asn Asp His Ser Phe Val Asn Ile
500 505 510



Thr Val Ser Ala Ala Phe Gly Gly Leu Ser Ser Ala Asn Leu Val Ala 
515 520 525 



Ser Asp Thr Thr Ile Asn Gly Phe Ser Ser Phe Cys Val Asp Thr Arg
530 535 540



Gin Phe Thr lie Thr Leu Phe Tyr Asn Val Thr Asn Ser Tyr Gly Tyr 
545 550 555 560 



Val Ser Lys Ser Gln Asp Ser Asn Cys Pro Phe Thr Leu Gln Ser Val
565 570 575 



Asn Asp Tyr Leu Ser Phe Ser Lys Phe Cys Val Ser Thr Ser Leu Leu 
580 585 590 



Ala Gly Ala Cys Thr He Asp Leu Phe Gly Tyr Pro Ala Phe Gly Ser 
595 600 605 



Gly Val Lys Leu Thr Ser Leu Tyr Phe Gln Phe Thr Lys Gly Glu Leu
610 615 620



Ile Thr Gly Thr Pro Lys Pro Leu Glu Gly Ile Thr Asp Val Ser Phe
625 630 635 640



Met Thr Leu Asp Val Cys Thr Lys Tyr Thr He Tyr Gly Phe Lys Gly 
645 650 655 



Glu Gly He He Thr Leu Thr Asn Ser Ser He Leu Ala Gly Val Tyr 
660 665 670 



Tyr Thr Ser Asp Ser Gly Gin Leu Leu Ala Phe Lys Asn Val Thr Ser 
675 680 685 



Gly Ala Val Tyr Ser Val Thr Pro Cys Ser Phe Ser Glu Gin Ala Ala 
690 695 700 



Tyr Val Asn Asp Asp Ile Val Gly Val Ile Ser Ser Leu Ser Asn Ser
705 710 715 720



Thr Phe Asn Asn Thr Arg Glu Leu Pro Gly Phe Phe Tyr His Ser Asn 
725 730 735

Asp Gly Ser Asn Cys Thr Glu Pro Val Leu Val Tyr Ser Asn Ile Gly
740 745 750



Val Cys Lys Ser Gly Ser Ile Gly Tyr Val Pro Ser Gln Tyr Gly Gln
755 760 765



Val Lys Ile Ala Pro Thr Val Thr Gly Asn Ile Ser Ile Pro Thr Asn
770 775 780



Phe Ser Met Ser Ile Arg Thr Glu Tyr Leu Gln Leu Tyr Asn Thr Pro
785 790 795 800



Val Ser Val Asp Cys Ala Thr Tyr Val Cys Asn Gly Asn Ser Arg Cys 
805 810 815 



Lys Gln Leu Leu Thr Gln Tyr Thr Ala Ala Cys Lys Thr Ile Glu Ser
820 825 830



Ala Leu Gln Leu Ser Ala Arg Leu Glu Ser Val Glu Val Asn Ser Met
835 840 845



Leu Thr Ile Ser Glu Glu Ala Leu Gln Leu Ala Thr Ile Ser Ser Phe
850 855 860



Asn Gly Asp Gly Tyr Asn Phe Thr Asn Val Leu Gly Ala Ser Val Tyr
865 870 875 880

Asp Pro Ala Ser Gly Arg Val Val Gln Lys Arg Ser Val Ile Glu Asp
885 890 895



Leu Leu Phe Asn Lys Val Val Thr Asn Gly Leu Gly Thr Val Asp Glu 
900 905 910



Asp Tyr Lys Arg Cys Ser Asn Gly Arg Ser Val Ala Asp Leu Val Cys 
915 920 925 



Ala Gin Tyr Tyr Ser Gly Val Met Val Leu Pro Gly Val Val Asp Ala 
930 935 940 



Glu Lys Leu His Met Tyr Ser Ala Ser Leu Ile Gly Gly Met Ala Leu
945 950 955 960



Gly Gly Ile Thr Ala Ala Ala Ala Leu Pro Phe Ser Tyr Ala Val Gln
965 970 975

Ala Arg Leu Asn Tyr Leu Ala Leu Gln Thr Asp Val Leu Gln Arg Asn
980 985 990



Gln Gln Leu Leu Ala Glu Ser Phe Asn Ser Ala Ile Gly Asn Ile Thr
995 1000 1005



Ser Ala Phe Glu Ser Val Lys Glu Ala lie Ser Gin Thr Ser Lys 
1010 1015 1020 



Gly Leu Asn Thr Val Ala His Ala Leu Thr Lys Val Gin Glu Val 
1025 1030 1035 



Val Asn Ser Gln Gly Ser Ala Leu Asn Gln Leu Thr Val Gln Leu
1040 1045 1050



Gln His Asn Phe Gln Ala Ile Ser Ser Ser Ile Asp Asp Ile Tyr
1055 1060 1065



Ser Arg Leu Asp Ile Leu Leu Ala Asp Val Gln Val Asp Arg Leu
1070 1075 1080



Ile Thr Gly Arg Leu Ser Ala Leu Asn Ala Phe Val Ala Gln Thr
1085 1090 1095



Leu Thr Lys Tyr Thr Glu Val Gln Ala Ser Arg Lys Leu Ala Gln
1100 1105 1110



Gln Lys Val Asn Glu Cys Val Lys Ser Gln Ser Gln Arg Tyr Gly
1115 1120 1125



Phe Cys Gly Gly Asp Gly Glu His He Phe Ser Leu Val Gin Ala 
1130 1135 1140 



Ala Pro Gin Gly Leu Leu Phe Leu His Thr Val Leu Val Pro Gly 
1145 1150 1155 



Asp Phe Val Asn Val Leu Ala He Ala Gly Leu Cys Val Asn Gly 
1160 1165 1170 



Glu Ile Ala Leu Thr Leu Arg Glu Pro Gly Leu Val Leu Phe Thr
1175 1180 1185



His Glu Leu Gln Thr Tyr Thr Ala Thr Glu Tyr Phe Val Ser Ser
1190 1195 1200

Arg Arg Met Phe Glu Pro Arg Lys Pro Thr Val Ser Asp Phe Val
1205 1210 1215



Gln Ile Glu Ser Cys Val Val Thr Tyr Val Asn Leu Thr Ser Asp
1220 1225 1230



Gin Leu Pro Asp Val lie Pro Asp Tyr lie Asp Val Asn Lys Thr 
1235 1240 1245 



Leu Asp Glu lie Leu Ala Ser Leu Pro Asn Arg Thr Gly Pro Ser 
1250 1255 1260 



Leu Pro Leu Asp Val Phe Asn Ala Thr Tyr Leu Asn Leu Thr Gly 
1265 1270 1275



Glu Ile Ala Asp Leu Glu Gln Arg Ser Glu Ser Leu Arg Asn Thr
1280 1285 1290



Thr Glu Glu Leu Arg Ser Leu lie Asn Asn He Asn Asn Thr Leu 
1295 1300 1305 



Val Asp Leu Glu Trp Leu Asn Arg Val Glu Thr Tyr He Lys Trp 
1310 1315 1320 



Pro Trp Trp Val Trp Leu Ile Ile Val Ile Val Leu Ile Phe Val
1325 1330 1335 



Val Ser Leu Leu Val Phe Cys Cys He Ser Thr Gly Cys Cys Gly 
1340 1345 1350 



Cys Cys Gly Cys Cys Gly Ala Cys Phe Ser Gly Cys Cys Arg Gly 
1355 1360 1365 



Pro Arg Leu Gin Pro Tyr Glu Ala Phe Glu Lys Val His Val Gin 
1370 1375 1380 



<210> 60 

<211> 1349

<212> PRT 

<213> porcine hemagglutinating encephalomyelitis virus 

<400> 60 

Met Phe Phe Ile Leu Leu Ile Ser Leu Pro Ser Ala Phe Ala Val Ile
1 5 10 15



Gly Asp Leu Lys Cys Thr Thr Ser Leu Ile Asn Asp Val Asp Thr Gly
20 25 30

Val Pro Ser Ile Ser Ser Glu Val Val Asp Val Thr Asn Gly Leu Gly
35 40 45



Thr Phe Tyr Val Leu Asp Arg Val Tyr Leu Asn Thr Thr Leu Leu Leu
50 55 60 



Asn Gly Tyr Tyr Pro Ile Ser Gly Ala Thr Phe Arg Asn Met Ala Leu
65 70 75 80 



Lys Gly Thr Arg Leu Leu Ser Thr Leu Trp Phe Lys Pro Pro Phe Leu
85 90 95 



Ser Pro Phe Asn Asp Gly He Phe Ala Lys Val Lys Asn Ser Arg Phe 
100 105 110 



Ser Lys Asp Gly Val He Tyr Ser Glu Phe Pro Ala He Thr He Gly 
115 120 125 



Ser Thr Phe Val Asn Thr Ser Tyr Ser Ile Val Val Glu Pro His Thr
130 135 140 



Ser Leu Ile Asn Gly Asn Leu Gln Gly Leu Leu Gln Ile Ser Val Cys
145 150 155 160



Gin Tyr Thr Met Cys Glu Tyr Pro His Thr He Cys His Pro Asn Leu 
165 170 175 



Gly Asn Gin Arg He Glu Leu Trp His Tyr Asp Thr Asp Val Val Ser 
180 185 190 



Cys Leu Tyr Arg Arg Asn Phe Thr Tyr Asp Val Asn Ala Asp Tyr Leu
195 200 205



Tyr Phe His Phe Tyr Gin Glu Gly Gly Thr Phe Tyr Ala Tyr Phe Thr 
210 215 220 



Asp Thr Gly Phe Val Thr Lys Phe Leu Phe Lys Leu Tyr Leu Gly Thr 
225 230 235 240 



Val Leu Ser His Tyr Tyr Val Met Pro Leu Thr Cys Asn Ser Ala Leu 
245 250 255 



Ser Leu Glu Tyr Trp Val Thr Pro Leu Thr Thr Arg Gin Phe Leu Leu 
260 265 270

Ala Phe Asp Gln Asp Gly Val Leu Tyr His Ala Val Asp Cys Ala Ser
275 280 285



Asp Phe Met Ser Glu Ile Met Cys Lys Thr Ser Ser Ile Thr Pro Pro
290 295 300



Thr Gly Val Tyr Glu Leu Asn Gly Tyr Thr Val Gin Pro Val Ala Thr 
305 310 315 320 



Val Tyr Arg Arg Ile Pro Asp Leu Pro Asn Cys Asp Ile Glu Ala Trp
325 330 335



Leu Asn Ser Lys Thr Val Ser Ser Pro Leu Asn Trp Glu Arg Lys Ile
340 345 350



Phe Ser Asn Cys Asn Phe Asn Met Gly Arg Leu Met Ser Phe He Gin 
355 360 365 



Ala Asp Ser Phe Gly Cys Asn Asn Ile Asp Ala Ser Arg Leu Tyr Gly
370 375 380



Met Cys Phe Gly Ser Ile Thr Ile Asp Lys Phe Ala Ile Pro Asn Ser
385 390 395 400



Arg Lys Val Asp Leu Gln Val Gly Lys Ser Gly Tyr Leu Gln Ser Phe
405 410 415 



Asn Tyr Lys Ile Asp Thr Ala Val Ser Ser Cys Gln Leu Tyr Tyr Ser
420 425 430 



Leu Pro Ala Ala Asn Val Ser Val Thr His Tyr Asn Pro Ser Ser Trp
435 440 445 



Asn Arg Arg Tyr Gly Phe Asn Asn Gln Ser Phe Gly Ser Arg Gly Leu
450 455 460



His Asp Ala Val Tyr Ser Gin Gin Cys Phe Asn Thr Pro Asn Thr Tyr 
465 470 475 480 



Cys Pro Cys Arg Thr Ser Gin Cys He Gly Gly Ala Gly Thr Gly Thr 
485 490 495 



Cys Pro Val Gly Thr Thr Val Arg Lys Cys Phe Ala Ala Val Thr Lys 
500 505 510 
Ala Thr Lys Cys Thr Cys Trp Cys Gln Pro Asp Pro Ser Thr Tyr Lys
515 520 525



Gly Val Asn Ala Trp Thr Cys Pro Gln Ser Lys Val Ser Ile Gln Pro
530 535 540



Gly Gin His Cys Pro Gly Leu Gly Leu Val Glu Asp Asp Cys Ser Gly 
545 550 555 560 



Asn Pro Cys Thr Cys Lys Pro Gln Ala Phe Ile Gly Trp Ser Ser Glu
565 570 575



Thr Cys Leu Gln Asn Gly Arg Cys Asn Ile Phe Ala Asn Phe Ile Leu
580 585 590 



Asn Asp Val Asn Ser Gly Thr Thr Cys Ser Thr Asp Leu Gln Gln Gly
595 600 605



Asn Thr Ile Ile Thr Thr Asp Val Cys Val Asn Tyr Asp Leu Tyr Gly
610 615 620 



Ile Thr Gly Gln Gly Ile Leu Ile Glu Val Asn Ala Thr Tyr Tyr Asn
625 630 635 640



Ser Trp Gin Asn Leu Leu Tyr Asp Ser Ser Gly Asn Leu Tyr Gly Phe 
645 650 655 



Arg Asp Tyr Leu Ser Asn Arg Thr Phe Leu Ile Arg Ser Cys Tyr Ser
660 665 670



Gly Arg Val Ser Ala Val Phe His Ala Asn Ser Ser Glu Pro Ala Leu 
675 680 685



Met Phe Arg Asn Leu Lys Cys Ser His Val Phe Asn Asn Thr Ile Leu
690 695 700 



Arg Gin He Gin Leu Val Asn Tyr Phe Asp Ser Tyr Leu Gly Cys Val 
705 710 715 720 



Val Asn Ala Tyr Asn Asn Thr Ala Ser Ala Val Ser Thr Cys Asp Leu 
725 730 735 



Thr Val Gly Ser Gly Tyr Cys Val Asp Tyr Val Thr Ala Leu Arg Ser
740 745 750 



Arg Arg Ser Phe Thr Thr Gly Tyr Arg Phe Thr Asn Phe Glu Pro Phe
755 760 765



Ala Ala Asn Leu Val Asn Asp Ser lie Glu Pro Val Gly Gly Leu Tyr 
770 775 780 



Glu lie Gin lie Pro Ser Glu Phe Thr lie Gly Asn Leu Glu Glu Phe 
785 790 795 800 



lie Gin Thr Arg Ser Pro Lys Val Thr lie Asp Cys Ala Thr Phe Val 
805 810 815 



Cys Gly Asp Tyr Ala Ala Cys Arg Gin Gin Leu Ala Glu Tyr Gly Ser 
820 825 830



Phe Cys Glu Asn Ile Asn Ala Ile Leu Thr Glu Val Asn Glu Leu Leu
835 840 845 



Asp Thr Thr Gln Leu Gln Val Ala Asn Ser Leu Met Asn Gly Val Thr
850 855 860 



Leu Ser Thr Lys lie Lys Asp Gly lie Asn Phe Asn Val Asp Asp lie 
865 870 875 880

Asn Phe Ser Pro Val Leu Gly Cys Leu Gly Ser Glu Cys Asn Arg Ala 
885 890 895



Ser Thr Arg Ser Ala lie Glu Asp Leu Leu Phe Asp Lys Val Lys Leu 
900 905 910



Ser Asp Val Gly Phe Val Gin Ala Tyr Asn Asn Cys Thr Gly Gly Ala 
915 920 925 



Glu Ile Arg Asp Leu Ile Cys Val Gln Ser Tyr Asn Gly Ile Lys Val
930 935 940 



Leu Pro Pro Leu Leu Ser Glu Asn Gln Ile Ser Gly Tyr Thr Leu Ala
945 950 955 960 



Ala Thr Ala Ala Ser Leu Phe Pro Pro Trp Thr Ala Ala Ala Gly Val 
965 970 975 



Pro Phe Tyr Leu Asn Val Gin Tyr Arg He Asn Gly Leu Gly Val Thr 
980 985 990 



Met Asp Val Leu Ser Gln Asn Gln Lys Leu Ile Ala Ser Ala Phe Asn
995 1000 1005
Asn Ala Leu Asp Ala Ile Gln Glu Gly Phe Asp Ala Thr Asn Ser
1010 1015 1020



Ala Leu Val Lys lie Gin Ala Val Val Asn Ala Asn Ala Glu Ala 
1025 1030 1035 



Leu Asn Asn Leu Leu Gin Gin Leu Ser Asn Arg Phe Gly Ala lie 
1040 1045 1050



Ser Ala Ser Leu Gin Glu lie Leu Ser Arg Leu Asp Ala Leu Glu 
1055 1060 1065 



Ala Lys Ala Gin lie Asp Arg Leu lie Asn Gly Arg Leu Thr Ala 
1070 1075 1080 



Leu Asn Ala Tyr Val Ser Gin Gin Leu Ser Asp Ser Thr Leu Val 
1085 1090 1095



Lys Phe Ser Ala Ala Gln Ala Ile Glu Lys Val Asn Glu Cys Val
1100 1105 1110



Lys Ser Gln Ser Ser Arg Ile Asn Phe Cys Gly Asn Gly Asn His
1115 1120 1125



Ile Ile Ser Leu Val Gln Asn Ala Pro Tyr Gly Leu Tyr Phe Ile
1130 1135 1140 



His Phe Ser Tyr Val Pro Thr Lys Tyr Val Thr Ala Lys Val Ser
1145 1150 1155 



Pro Gly Leu Cys lie Ala Gly Asp He Gly He Ser Pro Lys Ser 
1160 1165 1170 



Gly Tyr Phe Ile Asn Val Asn Asn Ser Trp Met Phe Thr Gly Ser
1175 1180 1185 



Ser Tyr Tyr Tyr Pro Glu Pro He Thr Gin Asn Asn Val Val Val 
1190 1195 1200



Met Ser Thr Cys Ala Val Asn Tyr Thr Lys Ala Pro Asp Leu Met
1205 1210 1215 



Leu Asn Thr Ser Thr Pro Asn Leu Pro Asp Phe Lys Glu Glu Leu 
1220 1225 1230 
Tyr Gln Trp Phe Lys Asn Gln Ser Ser Val Ala Pro Asp Leu Ser
1235 1240 1245 



Leu Asp Tyr Ile Asn Val Thr Phe Leu Asp Leu Gln Asp Glu Met
1250 1255 1260



Asn Arg Leu Gln Glu Ala Ile Lys Val Leu Asn Gln Ser Tyr Ile
1265 1270 1275 



Asn Leu Lys Asp He Gly Thr Tyr Glu Tyr Tyr Val Lys Trp Pro 
1280 1285 1290 



Trp Tyr Val Trp Leu Leu He Gly Leu Ala Gly Val Ala Met Leu 
1295 1300 1305 



Val Leu Leu Phe Phe Ile Cys Cys Cys Thr Gly Cys Gly Thr Ser
1310 1315 1320 



Cys Phe Lys Lys Cys Gly Gly Cys Cys Asp Asp Tyr Thr Gly His
1325 1330 1335



Gin Glu Phe Val He Lys Thr Ser His Asp Asp 
1340 1345 



<210> 61

<211> 1225

<212> PRT 

<213> Porcine respiratory coronavirus 

<400> 61 

Met Lys Lys Leu Phe Val Val Leu Val Val Met Pro Leu He Tyr Gly 
1 5 10 15



Asp Lys Phe Pro Thr Ser Val Val Ser Asn Cys Thr Asp Gin Cys Ala 
20 25 30 



Ser Tyr Val Ala Asn Val Phe Thr Thr Gln Pro Gly Gly Phe Ile Pro
35 40 45 



Ser Asp Phe Ser Phe Asn Asn Trp Phe Leu Leu Thr Asn Ser Ser Thr 
50 55 60 



Leu Val Ser Gly Lys Leu Val Thr Lys Gin Pro Leu Leu Val Asn Cys 
65 70 75 80 



Leu Trp Pro Val Pro Ser Phe Glu Glu Ala Ala Ser Thr Phe Cys Phe 
85 90 95



Glu Gly Ala Asp Phe Asp Gin Cys Asn Gly Ala Val Leu Asn Asn Thr 
100 105 110



Val Asp Val lie Arg Phe Asn Leu Asn Phe Thr Thr Asn Val Gin Ser 
115 120 125 



Gly Lys Gly Ala Thr Val Phe Ser Leu Asn Thr Thr Gly Gly Val Thr 
130 135 140 



Leu Glu He Ser Cys Tyr Asn Asp Thr Val Ser Asp Ser Ser Phe Ser 
145 150 155 160 



Ser Tyr Gly Glu lie Pro Phe Gly Val Thr Asn Gly Pro Arg Tyr Cys 
165 170 175 



Tyr Val Leu Tyr Asn Gly Thr Ala Leu Lys Tyr Leu Gly Thr Leu Pro 
180 185 190 



Pro Ser Val Lys Glu He Ala He Ser Lys Trp Gly His Phe Tyr He 
195 200 205 



Asn Gly Tyr Asn Phe Phe Ser Thr Phe Pro He Asp Cys He Ser Phe 
210 215 220 



Asn Leu Thr Thr Gly Asp Ser Asp Val Phe Trp Thr He Ala Tyr Thr 
225 230 235 240



Ser Tyr Thr Glu Ala Leu Val Gln Val Glu Asn Thr Ala Ile Thr Asn
245 250 255 



Val Thr Tyr Cys Asn Ser Tyr Val Asn Asn Ile Lys Cys Ser Gln Leu
260 265 270 



Thr Ala Asn Leu Asn Asn Gly Phe Tyr Pro Val Ser Ser Ser Glu Val
275 280 285 



Gly Ser Val Asn Lys Ser Val Val Leu Leu Pro Ser Phe Leu Thr His 
290 295 300



Thr He Val Asn He Thr He Gly Leu Gly Met Lys Arg Ser Gly Tyr 
305 310 315 320 



Gly Gin Pro He Ala Ser Thr Leu Ser Asn He Thr Leu Pro Met Gin 
325 330 335 
Asp Asn Asn Thr Asp Val Tyr Cys Val Arg Ser Asp Gln Phe Ser Val
340 345 350 



Tyr Val His Ser Thr Cys Lys Ser Ala Leu Trp Asp Asn Val Phe Lys
355 360 365 



Arg Asn Cys Thr Asp Val Leu Asp Ala Thr Ala Val He Lys Thr Gly 
370 375 380



Thr Cys Pro Phe Ser Phe Asp Lys Leu Asn Asn Tyr Leu Thr Phe Asn
385 390 395 400



Lys Phe Cys Leu Ser Leu Ser Pro Val Gly Ala Asn Cys Lys Phe Asp
405 410 415



Val Ala Ala Arg Thr Arg Thr Asn Glu Gln Val Val Arg Ser Leu Tyr
420 425 430



Val Ile Tyr Glu Glu Gly Asp Ser Ile Val Gly Val Pro Ser Asp Asn
435 440 445



Ser Gly Leu His Asp Leu Ser Val Leu His Leu Asp Ser Cys Thr Asp
450 455 460



Tyr Asn He Tyr Gly Arg Thr Gly Val Gly He He Arg Gin Thr Asn 
465 470 475 480 



Arg Thr Leu Leu Ser Gly Leu Tyr Tyr Thr Ser Leu Ser Gly Asp Leu
485 490 495



Leu Gly Phe Lys Asn Val Ser Asp Gly Val Ile Tyr Ser Val Thr Pro
500 505 510



Cys Asp Val Ser Ala Gin Ala Ala Val He Asp Gly Thr He Val Gly 
515 520 525 



Ala He Thr Ser He Asn Ser Glu Leu Leu Gly Leu Thr His Trp Thr 
530 535 540 



He Thr Pro Asn Phe Tyr Tyr Tyr Ser He Tyr Asn Tyr Thr Asn Asp 
545 550 555 560 



Lys Thr Arg Gly Thr Pro He Asp Ser Asn Asp Val Gly Cys Glu Pro 
565 570 575 
Val Ile Thr Tyr Ser Asn Ile Gly Val Cys Lys Asn Gly Ala Leu Val
580 585 590



Phe Ile Asn Val Thr His Ser Asp Gly Asp Val Gln Pro Ile Ser Thr
595 600 605



Gly Asn Val Thr He Pro Thr Asn Phe Thr He Ser Val Gin Val Glu 
610 615 620 



Tyr He Gin Val Tyr Thr Thr Pro Val Ser He Asp Cys Ser Arg Tyr 
625 630 635 640



Val Cys Asn Gly Asn Pro Arg Cys Asn Lys Leu Leu Thr Gln Tyr Val
645 650 655 



Ser Ala Cys Gin Thr He Glu Gin Ala Leu Ala Met Gly Ala Arg Leu 
660 665 670 



Glu Asn Met Glu Val Asp Ser Met Leu Phe Val Ser Glu Asn Ala Leu 
675 680 685 



Lys Leu Ala Ser Val Glu Ala Phe Asn Ser Ser Glu Thr Leu Asp Pro 
690 695 700



Ile Tyr Thr Gln Trp Pro Asn Ile Gly Gly Phe Trp Leu Glu Gly Leu
705 710 715 720



Lys Tyr Ile Leu Pro Ser Asp Asn Ser Lys Arg Lys Tyr Arg Ser Ala
725 730 735 



He Glu Asp Leu Leu Phe Ser Lys Val Val Thr Ser Gly Leu Gly Thr 
740 745 750 



Val Asp Glu Asp Tyr Lys Arg Cys Thr Gly Gly Tyr Asp He Ala Asp 
755 760 765 



Leu Val Cys Ala Gin Tyr Tyr Asn Gly He Met Val Leu Pro Gly Val 
770 775 780 



Ala Asn Ala Asp Lys Met Thr Met Tyr Thr Ala Ser Leu Ala Gly Gly 
785 790 795 800 



He Thr Leu Gly Ala Phe Gly Gly Gly Ala Val Ser He Pro Phe Ala 
805 810 815 
Val Ala Val Gln Ala Arg Leu Asn Tyr Val Ala Leu Gln Thr Asp Val
820 825 830



Leu Asn Lys Asn Gln Gln Ile Leu Ala Ser Ala Phe Asn Gln Ala Ile
835 840 845 



Gly Asn He Thr Gin Ser Phe Gly Lys Val Asn Asp Ala He His Gin 
850 855 860



Thr Ser Arg Gly Leu Thr Thr Val Ala Lys Ala Leu Ala Lys Val Gln
865 870 875 880



Asp Val Val Asn Thr Gln Gly Gln Ala Leu Arg His Leu Thr Val Gln
885 890 895



Leu Gin Asn Asn Phe Gin Ala He Ser Ser Ser He Ser Asp He Tyr 
900 905 910 



Asn Arg Leu Asp Glu Leu Ser Ala Asp Ala Gln Val Asp Arg Leu Ile
915 920 925 



Thr Gly Arg Leu Thr Ala Leu Asn Ala Phe Val Ser Gln Thr Leu Thr
930 935 940


Arg Gln Ala Glu Val Arg Ala Ser Arg Gln Leu Ala Lys Asp Lys Val
945 950 955 960 



Asn Glu Cys Val Arg Ser Gln Ser Gln Arg Phe Gly Phe Cys Gly Asn
965 970 975



Gly Thr His Leu Phe Ser Leu Ala Asn Ala Ala Pro Asn Gly Met He 
980 985 990 



Phe Phe His Thr Val Leu Leu Pro Thr Ala Tyr Glu Thr Val Thr Ala
995 1000 1005 



Trp Ser Gly Ile Cys Ala Leu Asp Gly Asp Arg Thr Phe Gly Leu
1010 1015 1020 



Val Val Lys Asp Val Gin Leu Thr Leu Phe Arg Asn Leu Asp Asp 
1025 1030 1035



Lys Phe Tyr Leu Thr Pro Arg Thr Met Tyr Gin Pro Arg Val Ala 
1040 1045 1050 



Thr Ser Ser Asp Phe Val Gln Ile Glu Gly Cys Asp Val Leu Phe
1055 1060 1065



Val Asn Thr Thr Val Ser Asp Leu Pro Ser lie He Pro Asp Tyr 
1070 1075 1080 



He Asp He Asn Gin Thr Val Gin Asp He Leu Glu Asn Phe Arg 
1085 1090 1095



Pro Asn Trp Thr Val Pro Glu Leu Thr Leu Asp Val Phe Asn Ala 
1100 1105 1110



Thr Tyr Leu Asn Leu Thr Gly Glu Ile Asp Asp Leu Glu Phe Arg
1115 1120 1125



Ser Glu Lys Leu His Asn Thr Thr Val Glu Leu Ala Ile Leu Ile
1130 1135 1140 



Asp Asn He Asn Asn Thr Leu Val Asn Leu Glu Trp Leu Asn Arg 
1145 1150 1155 



Ile Glu Thr Tyr Val Lys Trp Pro Trp Tyr Val Trp Leu Leu Ile
1160 1165 1170 

Gly Leu Val Val He Phe Cys He Pro Leu Leu Leu Phe Cys Cys 
1175 1180 1185



Cys Ser Thr Gly Cys Cys Gly Cys He Gly Cys Leu Gly Ser Cys 
1190 1195 1200



Cys His Ser He Phe Ser Arg Arg Gin Phe Glu Asn Tyr Glu Pro 
1205 1210 1215 



He Glu Lys Val His Val His 
1220 1225 



<210> 62

<211> 82 

<212> PRT 

<213> Porcine transmissible gastroenteritis coronavirus

<400> 62 

Met Thr Phe Pro Arg Ala Leu Thr Val He Asp Asp Asn Gly Met Val 
1 5 10 15



Ile Asn Ile Ile Phe Trp Phe Leu Leu Ile Ile Ile Leu Ile Leu Leu
20 25 30
Ser Ile Ala Leu Leu Asn Ile Ile Lys Leu Cys Met Val Cys Cys Asn
35 40 45



Leu Gly Arg Thr Val Ile Ile Val Pro Ala Gln His Ala Tyr Asp Ala
50 55 60



Tyr Lys Asn Phe Met Arg Ile Lys Ala Tyr Asn Pro Asp Gly Ala Leu
65 70 75 80 



Leu Ala 



<210> 63

<211> 4376

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 63

Met Glu Ser Leu Val Leu Gly Val Asn Glu Lys Thr His Val Gin Leu 
1 5 10 15



Ser Leu Pro Val Leu Gln Val Arg Asp Val Leu Val Arg Gly Phe Gly
20 25 30



Asp Ser Val Glu Glu Ala Leu Ser Glu Ala Arg Glu His Leu Lys Asn 
35 40 45



Gly Thr Cys Gly Leu Val Glu Leu Glu Lys Gly Val Leu Pro Gln Leu
50 55 60 



Glu Gin Pro Tyr Val Phe He Lys Arg Ser Asp Ala Leu Ser Thr Asn 
65 70 75 80 



His Gly His Lys Val Val Glu Leu Val Ala Glu Met Asp Gly He Gin 
85 90 95



Tyr Gly Arg Ser Gly He Thr Leu Gly Val Leu Val Pro His Val Gly 
100 105 110 



Glu Thr Pro He Ala Tyr Arg Asn Val Leu Leu Arg Lys Asn Gly Asn 
115 120 125 



Lys Gly Ala Gly Gly His Ser Tyr Gly He Asp Leu Lys Ser Tyr Asp 
130 135 140 



Leu Gly Asp Glu Leu Gly Thr Asp Pro He Glu Asp Tyr Glu Gin Asn 
145 150 155 160



Trp Asn Thr Lys His Gly Ser Gly Ala Leu Arg Glu Leu Thr Arg Glu
165 170 175



Leu Asn Gly Gly Ala Val Thr Arg Tyr Val Asp Asn Asn Phe Cys Gly 
180 185 190 



Pro Asp Gly Tyr Pro Leu Asp Cys Ile Lys Asp Phe Leu Ala Arg Ala
195 200 205



Gly Lys Ser Met Cys Thr Leu Ser Glu Gin Leu Asp Tyr He Glu Ser 
210 215 220 



Lys Arg Gly Val Tyr Cys Cys Arg Asp His Glu His Glu He Ala Trp 
225 230 235 240 



Phe Thr Glu Arg Ser Asp Lys Ser Tyr Glu His Gln Thr Pro Phe Glu
245 250 255 



He Lys Ser Ala Lys Lys Phe Asp Thr Phe Lys Gly Glu Cys Pro Lys 
260 265 270



Phe Val Phe Pro Leu Asn Ser Lys Val Lys Val He Gin Pro Arg Val 
275 280 285 



Glu Lys Lys Lys Thr Glu Gly Phe Met Gly Arg He Arg Ser Val Tyr 
290 295 300 



Pro Val Ala Ser Pro Gin Glu Cys Asn Asn Met His Leu Ser Thr Leu 
305 310 315 320 



Met Lys Cys Asn His Cys Asp Glu Val Ser Trp Gin Thr Cys Asp Phe 
325 330 335 



Leu Lys Ala Thr Cys Glu His Cys Gly Thr Glu Asn Leu Val Ile Glu
340 345 350 



Gly Pro Thr Thr Cys Gly Tyr Leu Pro Thr Asn Ala Val Val Lys Met 
355 360 365 



Pro Cys Pro Ala Cys Gin Asp Pro Glu He Gly Pro Glu His Ser Val 
370 375 380 



Ala Asp Tyr His Asn His Ser Asn He Glu Thr Arg Leu Arg Lys Gly 
385 390 395 400 



Gly Arg Thr Arg Cys Phe Gly Gly Cys Val Phe Ala Tyr Val Gly Cys
405 410 415 



Tyr Asn Lys Arg Ala Tyr Trp Val Pro Arg Ala Ser Ala Asp He Gly 
420 425 430



Ser Gly His Thr Gly He Thr Gly Asp Asn Val Glu Thr Leu Asn Glu 
435 440 445 



Asp Leu Leu Glu He Leu Ser Arg Glu Arg Val Asn He Asn He Val 
450 455 460 



Gly Asp Phe His Leu Asn Glu Glu Val Ala He He Leu Ala Ser Phe 
465 470 475 480



Ser Ala Ser Thr Ser Ala Phe Ile Asp Thr Ile Lys Ser Leu Asp Tyr
485 490 495 



Lys Ser Phe Lys Thr He Val Glu Ser Cys Gly Asn Tyr Lys Val Thr 
500 505 510



Lys Gly Lys Pro Val Lys Gly Ala Trp Asn Ile Gly Gln Gln Arg Ser
515 520 525



Val Leu Thr Pro Leu Cys Gly Phe Pro Ser Gln Ala Ala Gly Val Ile
530 535 540 



Arg Ser He Phe Ala Arg Thr Leu Asp Ala Ala Asn His Ser He Pro 
545 550 555 560



Asp Leu Gin Arg Ala Ala Val Thr He Leu Asp Gly He Ser Glu Gin 
565 570 575 



Ser Leu Arg Leu Val Asp Ala Met Val Tyr Thr Ser Asp Leu Leu Thr
580 585 590 



Asn Ser Val He He Met Ala Tyr Val Thr Gly Gly Leu Val Gin Gin 
595 600 605 



Thr Ser Gin Trp Leu Ser Asn Leu Leu Gly Thr Thr Val Glu Lys Leu 
610 615 620 



Arg Pro He Phe Glu Trp He Glu Ala Lys Leu Ser Ala Gly Val Glu 
625 630 635 640 



Phe Leu Lys Asp Ala Trp Glu Ile Leu Lys Phe Leu Ile Thr Gly Val
645 650 655

Phe Asp Ile Val Lys Gly Gln Ile Gln Val Ala Ser Asp Asn Ile Lys
660 665 670

Asp Cys Val Lys Cys Phe Ile Asp Val Val Asn Lys Ala Leu Glu Met
675 680 685

Cys Ile Asp Gln Val Thr Ile Ala Gly Ala Lys Leu Arg Ser Leu Asn
690 695 700



Leu Gly Glu Val Phe Ile Ala Gln Ser Lys Gly Leu Tyr Arg Gln Cys
705 710 715 720



Ile Arg Gly Lys Glu Gln Leu Gln Leu Leu Met Pro Leu Lys Ala Pro
725 730 735 



Lys Glu Val Thr Phe Leu Glu Gly Asp Ser His Asp Thr Val Leu Thr 
740 745 750



Ser Glu Glu Val Val Leu Lys Asn Gly Glu Leu Glu Ala Leu Glu Thr 
755 760 765



Pro Val Asp Ser Phe Thr Asn Gly Ala Ile Val Gly Thr Pro Val Cys
770 775 780



Val Asn Gly Leu Met Leu Leu Glu Ile Lys Asp Lys Glu Gln Tyr Cys
785 790 795 800



Ala Leu Ser Pro Gly Leu Leu Ala Thr Asn Asn Val Phe Arg Leu Lys 
805 810 815 



Gly Gly Ala Pro Ile Lys Gly Val Thr Phe Gly Glu Asp Thr Val Trp
820 825 830 



Glu Val Gin Gly Tyr Lys Asn Val Arg He Thr Phe Glu Leu Asp Glu 
835 840 845 



Arg Val Asp Lys Val Leu Asn Glu Lys Cys Ser Val Tyr Thr Val Glu 
850 855 860 



Ser Gly Thr Glu Val Thr Glu Phe Ala Cys Val Val Ala Glu Ala Val 
865 870 875 880
Val Lys Thr Leu Gln Pro Val Ser Asp Leu Leu Thr Asn Met Gly Ile
885 890 895



Asp Leu Asp Glu Trp Ser Val Ala Thr Phe Tyr Leu Phe Asp Asp Ala
900 905 910



Gly Glu Glu Asn Phe Ser Ser Arg Met Tyr Cys Ser Phe Tyr Pro Pro 
915 920 925



Asp Glu Glu Glu Glu Asp Asp Ala Glu Cys Glu Glu Glu Glu Ile Asp
930 935 940 



Glu Thr Cys Glu His Glu Tyr Gly Thr Glu Asp Asp Tyr Gln Gly Leu
945 950 955 960



Pro Leu Glu Phe Gly Ala Ser Ala Glu Thr Val Arg Val Glu Glu Glu 
965 970 975 



Glu Glu Glu Asp Trp Leu Asp Asp Thr Thr Glu Gln Ser Glu Ile Glu
980 985 990



Pro Glu Pro Glu Pro Thr Pro Glu Glu Pro Val Asn Gln Phe Thr Gly
995 1000 1005



Tyr Leu Lys Leu Thr Asp Asn Val Ala He Lys Cys Val Asp He 
1010 1015 1020 



Val Lys Glu Ala Gin Ser Ala Asn Pro Met Val He Val Asn Ala 
1025 1030 1035 



Ala Asn He His Leu Lys His Gly Gly Gly Val Ala Gly Ala Leu 
1040 1045 1050 



Asn Lys Ala Thr Asn Gly Ala Met Gin Lys Glu Ser Asp Asp Tyr 
1055 1060 1065 



He Lys Leu Asn Gly Pro Leu Thr Val Gly Gly Ser Cys Leu Leu 
1070 1075 1080



Ser Gly His Asn Leu Ala Lys Lys Cys Leu His Val Val Gly Pro 
1085 1090 1095 



Asn Leu Asn Ala Gly Glu Asp He Gin Leu Leu Lys Ala Ala Tyr 
1100 1105 1110 



Glu Asn Phe Asn Ser Gin Asp He Leu Leu Ala Pro Leu Leu Ser 
1115 1120 1125



Ala Gly lie Phe Gly Ala Lys Pro Leu Gin Ser Leu Gin Val Cys 
1130 1135 1140 



Val Gin Thr Val Arg Thr Gin Val Tyr lie Ala Val Asn Asp Lys 
1145 1150 1155 



Ala Leu Tyr Glu Gln Val Val Met Asp Tyr Leu Asp Asn Leu Lys
1160 1165 1170 



Pro Arg Val Glu Ala Pro Lys Gln Glu Glu Pro Pro Asn Thr Glu
1175 1180 1185 



Asp Ser Lys Thr Glu Glu Lys Ser Val Val Gln Lys Pro Val Asp
1190 1195 1200



Val Lys Pro Lys Ile Lys Ala Cys Ile Asp Glu Val Thr Thr Thr
1205 1210 1215



Leu Glu Glu Thr Lys Phe Leu Thr Asn Lys Leu Leu Leu Phe Ala
1220 1225 1230



Asp Ile Asn Gly Lys Leu Tyr His Asp Ser Gln Asn Met Leu Arg
1235 1240 1245



Gly Glu Asp Met Ser Phe Leu Glu Lys Asp Ala Pro Tyr Met Val 
1250 1255 1260 



Gly Asp Val He Thr Ser Gly Asp He Thr Cys Val Val He Pro 
1265 1270 1275



Ser Lys Lys Ala Gly Gly Thr Thr Glu Met Leu Ser Arg Ala Leu
1280 1285 1290



Lys Lys Val Pro Val Asp Glu Tyr Ile Thr Thr Tyr Pro Gly Gln
1295 1300 1305 



Gly Cys Ala Gly Tyr Thr Leu Glu Glu Ala Lys Thr Ala Leu Lys 
1310 1315 1320 



Lys Cys Lys Ser Ala Phe Tyr Val Leu Pro Ser Glu Ala Pro Asn 
1325 1330 1335 



Ala Lys Glu Glu He Leu Gly Thr Val Ser Trp Asn Leu Arg Glu 
1340 1345 1350 
Met Leu Ala His Ala Glu Glu Thr Arg Lys Leu Met Pro Ile Cys
1355 1360 1365



Met Asp Val Arg Ala He Met Ala Thr He Gin Arg Lys Tyr Lys 
1370 1375 1380 



Gly Ile Lys Ile Gln Glu Gly Ile Val Asp Tyr Gly Val Arg Phe
1385 1390 1395 



Phe Phe Tyr Thr Ser Lys Glu Pro Val Ala Ser He He Thr Lys 
1400 1405 1410



Leu Asn Ser Leu Asn Glu Pro Leu Val Thr Met Pro He Gly Tyr 
1415 1420 1425



Val Thr His Gly Phe Asn Leu Glu Glu Ala Ala Arg Cys Met Arg 
1430 1435 1440



Ser Leu Lys Ala Pro Ala Val Val Ser Val Ser Ser Pro Asp Ala 
1445 1450 1455 



Val Thr Thr Tyr Asn Gly Tyr Leu Thr Ser Ser Ser Lys Thr Ser 
1460 1465 1470



Glu Glu His Phe Val Glu Thr Val Ser Leu Ala Gly Ser Tyr Arg 
1475 1480 1485 



Trp Ser Tyr Ser Gly Gin Arg Thr Glu Leu Gly Val Glu Phe 
1490 1495 1500 



Leu Lys Arg Gly Asp Lys Ile Val Tyr His Thr Leu Glu Ser Pro
1505 1510 1515 



Val Glu Phe His Leu Asp Gly Glu Val Leu Ser Leu Asp Lys Leu 
1520 1525 1530 



Lys Ser Leu Leu Ser Leu Arg Glu Val Lys Thr He Lys Val Phe 
1535 1540 1545



Thr Thr Val Asp Asn Thr Asn Leu His Thr Gin Leu Val Asp Met 
1550 1555 1560 



Ser Met Thr Tyr Gly Gin Gin Phe Gly Pro Thr Tyr Leu Asp Gly 
1565 1570 1575 
Ala Asp Val Thr Lys Ile Lys Pro His Val Asn His Gln Gly Lys
1580 1585 1590 



Thr Phe Phe Val Leu Pro Ser Asp Asp Thr Leu Arg Ser Glu Ala 
1595 1600 1605 



Phe Glu Tyr Tyr His Thr Leu Asp Glu Ser Phe Leu Gly Arg Tyr
1610 1615 1620



Met Ser Ala Leu Asn His Thr Lys Lys Trp Lys Phe Pro Gin Val 
1625 1630 1635 



Gly Gly Leu Thr Ser He Lys Trp Ala Asp Asn Asn Cys Tyr Leu 
1640 1645 1650 



Ser Ser Val Leu Leu Ala Leu Gln Gln Leu Glu Val Lys Phe Asn
1655 1660 1665



Ala Pro Ala Leu Gin Glu Ala Tyr Tyr Arg Ala Arg Ala Gly Asp 
1670 1675 1680



Ala Ala Asn Phe Cys Ala Leu Ile Leu Ala Tyr Ser Asn Lys Thr
1685 1690 1695 



Val Gly Glu Leu Gly Asp Val Arg Glu Thr Met Thr His Leu Leu
1700 1705 1710



Gin His Ala Asn Leu Glu Ser Ala Lys Arg Val Leu Asn Val Val 
1715 1720 1725 



Cys Lys His Cys Gly Gin Lys Thr Thr Thr Leu Thr Gly Val Glu 
1730 1735 1740 



Ala Val Met Tyr Met Gly Thr Leu Ser Tyr Asp Asn Leu Lys Thr 
1745 1750 1755 



Gly Val Ser He Pro Cys Val Cys Gly Arg Asp Ala Thr Gin Tyr 
1760 1765 1770 



Leu Val Gin Gin Glu Ser Ser Phe Val Met Met Ser Ala Pro Pro 
1775 1780 1785 



Ala Glu Tyr Lys Leu Gin Gin Gly Thr Phe Leu Cys Ala Asn Glu 
1790 1795 1800 
Tyr Thr Gly Asn Tyr Gln Cys Gly His Tyr Thr His Ile Thr Ala
1805 1810 1815



Lys Glu Thr Leu Tyr Arg He Asp Gly Ala His Leu Thr Lys Met 
1820 1825 1830



Ser Glu Tyr Lys Gly Pro Val Thr Asp Val Phe Tyr Lys Glu Thr 
1835 1840 1845 



Ser Tyr Thr Thr Thr He Lys Pro Val Ser Tyr Lys Leu Asp Gly 
1850 1855 1860 



Val Thr Tyr Thr Glu He Glu Pro Lys Leu Asp Gly Tyr Tyr Lys 
1865 1870 1875



Lys Asp Asn Ala Tyr Tyr Thr Glu Gin Pro He Asp Leu Val Pro 
1880 1885 1890



Thr Gln Pro Leu Pro Asn Ala Ser Phe Asp Asn Phe Lys Leu Thr
1895 1900 1905 



Cys Ser Asn Thr Lys Phe Ala Asp Asp Leu Asn Gln Met Thr Gly
1910 1915 1920 



Phe Thr Lys Pro Ala Ser Arg Glu Leu Ser Val Thr Phe Phe Pro
1925 1930 1935 



Asp Leu Asn Gly Asp Val Val Ala He Asp Tyr Arg His Tyr Ser 
1940 1945 1950 



Ala Ser Phe Lys Lys Gly Ala Lys Leu Leu His Lys Pro He Val 
1955 1960 1965



Trp His Ile Asn Gln Ala Thr Thr Lys Thr Thr Phe Lys Pro Asn
1970 1975 1980 



Thr Trp Cys Leu Arg Cys Leu Trp Ser Thr Lys Pro Val Asp Thr
1985 1990 1995 



Ser Asn Ser Phe Glu Val Leu Ala Val Glu Asp Thr Gin Gly Met 
2000 2005 2010 



Asp Asn Leu Ala Cys Glu Ser Gin Gin Pro Thr Ser Glu Glu Val 
2015 2020 2025 



Val Glu Asn Pro Thr He Gin Lys Glu Val He Glu Cys Asp Val 
2030 2035 2040



Lys Thr Thr Glu Val Val Gly Asn Val Ile Leu Lys Pro Ser Asp
2045 2050 2055 



Glu Gly Val Lys Val Thr Gin Glu Leu Gly His Glu Asp Leu Met 
2060 2065 2070



Ala Ala Tyr Val Glu Asn Thr Ser Ile Thr Ile Lys Lys Pro Asn
2075 2080 2085 



Glu Leu Ser Leu Ala Leu Gly Leu Lys Thr Ile Ala Thr His Gly
2090 2095 2100



Ile Ala Ala Ile Asn Ser Val Pro Trp Ser Lys Ile Leu Ala Tyr
2105 2110 2115 



Val Lys Pro Phe Leu Gly Gln Ala Ala Ile Thr Thr Ser Asn Cys
2120 2125 2130 



Ala Lys Arg Leu Ala Gin Arg Val Phe Asn Asn Tyr Met Pro Tyr 
2135 2140 2145



Val Phe Thr Leu Leu Phe Gln Leu Cys Thr Phe Thr Lys Ser Thr
2150 2155 2160 


Asn Ser Arg Ile Arg Ala Ser Leu Pro Thr Thr Ile Ala Lys Asn
2165 2170 2175 



Ser Val Lys Ser Val Ala Lys Leu Cys Leu Asp Ala Gly He Asn 
2180 2185 2190 



Tyr Val Lys Ser Pro Lys Phe Ser Lys Leu Phe Thr He Ala Met 
2195 2200 2205 



Trp Leu Leu Leu Leu Ser He Cys Leu Gly Ser Leu He Cys Val 
2210 2215 2220 



Thr Ala Ala Phe Gly Val Leu Leu Ser Asn Phe Gly Ala Pro Ser 
2225 2230 2235 



Tyr Cys Asn Gly Val Arg Glu Leu Tyr Leu Asn Ser Ser Asn Val 
2240 2245 2250 



Thr Thr Met Asp Phe Cys Glu Gly Ser Phe Pro Cys Ser He Cys 
2255 2260 2265 
Leu Ser Gly Leu Asp Ser Leu Asp Ser Tyr Pro Ala Leu Glu Thr
2270 2275 2280



Ile Gln Val Thr Ile Ser Ser Tyr Lys Leu Asp Leu Thr Ile Leu
2285 2290 2295



Gly Leu Ala Ala Glu Trp Val Leu Ala Tyr Met Leu Phe Thr Lys 
2300 2305 2310 



Phe Phe Tyr Leu Leu Gly Leu Ser Ala lie Met Gin Val Phe Phe 
2315 2320 2325 



Gly Tyr Phe Ala Ser His Phe He Ser Asn Ser Trp Leu Met Trp 
2330 2335 2340



Phe Ile Ile Ser Ile Val Gln Met Ala Pro Val Ser Ala Met Val
2345 2350 2355 



Arg Met Tyr Ile Phe Phe Ala Ser Phe Tyr Tyr Ile Trp Lys Ser
2360 2365 2370 



Tyr Val His Ile Met Asp Gly Cys Thr Ser Ser Thr Cys Met Met
2375 2380 2385



Cys Tyr Lys Arg Asn Arg Ala Thr Arg Val Glu Cys Thr Thr He 
2390 2395 2400 



Val Asn Gly Met Lys Arg Ser Phe Tyr Val Tyr Ala Asn Gly Gly 
2405 2410 2415 



Arg Gly Phe Cys Lys Thr His Asn Trp Asn Cys Leu Asn Cys Asp
2420 2425 2430



Thr Phe Cys Thr Gly Ser Thr Phe Ile Ser Asp Glu Val Ala Arg
2435 2440 2445



Asp Leu Ser Leu Gin Phe Lys Arg Pro He Asn Pro Thr Asp Gin 
2450 2455 2460 



Ser Ser Tyr He Val Asp Ser Val Ala Val Lys Asn Gly Ala Leu 
2465 2470 2475 



His Leu Tyr Phe Asp Lys Ala Gly Gin Lys Thr Tyr Glu Arg His 
2480 2485 2490
Pro Leu Ser His Phe Val Asn Leu Asp Asn Leu Arg Ala Asn Asn
2495 2500 2505 



Thr Lys Gly Ser Leu Pro lie Asn Val lie Val Phe Asp Gly Lys 
2510 2515 2520



Ser Lys Cys Asp Glu Ser Ala Ser Lys Ser Ala Ser Val Tyr Tyr
2525 2530 2535 



Ser Gin Leu Met Cys Gin Pro lie Leu Leu Leu Asp Gin Ala Leu 
2540 2545 2550 



Val Ser Asp Val Gly Asp Ser Thr Glu Val Ser Val Lys Met Phe 
2555 2560 2565 



Asp Ala Tyr Val Asp Thr Phe Ser Ala Thr Phe Ser Val Pro Met 
2570 2575 2580 



Glu Lys Leu Lys Ala Leu Val Ala Thr Ala His Ser Glu Leu Ala 
2585 2590 2595



Lys Gly Val Ala Leu Asp Gly Val Leu Ser Thr Phe Val Ser Ala
2600 2605 2610



Ala Arg Gln Gly Val Val Asp Thr Asp Val Asp Thr Lys Asp Val
2615 2620 2625



lie Glu Cys Leu Lys Leu Ser His His Ser Asp Leu Glu Val Thr 
2630 2635 2640 



Gly Asp Ser Cys Asn Asn Phe Met Leu Thr Tyr Asn Lys Val Glu 
2645 2650 2655



Asn Met Thr Pro Arg Asp Leu Gly Ala Cys lie Asp Cys Asn Ala 
2660 2665 2670 



Arg His He Asn Ala Gin Val Ala Lys Ser His Asn Val Ser Leu 
2675 2680 2685 



lie Trp Asn Val Lys Asp Tyr Met Ser Leu Ser Glu Gin Leu Arg 
2690 2695 2700 



Lys Gin He Arg Ser Ala Ala Lys Lys Asn Asn He Pro Phe Arg 
2705 2710 2715 
Leu Thr Cys Ala Thr Thr Arg Gln Val Val Asn Val Ile Thr Thr
2720 2725 2730



Lys Ile Ser Leu Lys Gly Gly Lys Ile Val Ser Thr Cys Phe Lys
2735 2740 2745



Leu Met Leu Lys Ala Thr Leu Leu Cys Val Leu Ala Ala Leu Val 
2750 2755 2760 



Cys Tyr lie Val Met Pro Val His Thr Leu Ser He His Asp Gly 
2765 2770 2775 



Tyr Thr Asn Glu He He Gly Tyr Lys Ala He Gin Asp Gly Val 
2780 2785 2790



Thr Arg Asp Ile Ile Ser Thr Asp Asp Cys Phe Ala Asn Lys His
2795 2800 2805



Ala Gly Phe Asp Ala Trp Phe Ser Gin Arg Gly Gly Ser Tyr Lys 
2810 2815 2820



Asn Asp Lys Ser Cys Pro Val Val Ala Ala He He Thr Arg Glu 
2825 2830 2835 



He Gly Phe He Val Pro Gly Leu Pro Gly Thr Val Leu Arg Ala 
2840 2845 2850 



Ile Asn Gly Asp Phe Leu His Phe Leu Pro Arg Val Phe Ser Ala
2855 2860 2865 



Val Gly Asn lie Cys Tyr Thr Pro Ser Lys Leu lie Glu Tyr Ser 
2870 2875 2880 



Asp Phe Ala Thr Ser Ala Cys Val Leu Ala Ala Glu Cys Thr He 
2885 2890 2895



Phe Lys Asp Ala Met Gly Lys Pro Val Pro Tyr Cys Tyr Asp Thr
2900 2905 2910



Asn Leu Leu Glu Gly Ser lie Ser Tyr Ser Glu Leu Arg Pro Asp 
2915 2920 2925 



Thr Arg Tyr Val Leu Met Asp Gly Ser He He Gin Phe Pro Asn 
2930 2935 2940 



Thr Tyr Leu Glu Gly Ser Val Arg Val Val Thr Thr Phe Asp Ala 
2945 2950 2955

Glu Tyr Cys Arg His Gly Thr Cys Glu Arg Ser Glu Val Gly Ile
2960 2965 2970 

Cys Leu Ser Thr Ser Gly Arg Trp Val Leu Asn Asn Glu His Tyr
2975 2980 2985 

Arg Ala Leu Ser Gly Val Phe Cys Gly Val Asp Ala Met Asn Leu 
2990 2995 3000 

Ile Ala Asn Ile Phe Thr Pro Leu Val Gln Pro Val Gly Ala Leu
3005 3010 3015 

Asp Val Ser Ala Ser Val Val Ala Gly Gly Ile Ile Ala Ile Leu
3020 3025 3030



Val Thr Cys Ala Ala Tyr Tyr Phe Met Lys Phe Arg Arg Val Phe
3035 3040 3045

Gly Glu Tyr Asn His Val Val Ala Ala Asn Ala Leu Leu Phe Leu
3050 3055 3060 



Met Ser Phe Thr He Leu Cys Leu Val Pro Ala Tyr Ser Phe Leu 
3065 3070 3075 



Pro Gly Val Tyr Ser Val Phe Tyr Leu Tyr Leu Thr Phe Tyr Phe 
3080 3085 3090



Thr Asn Asp Val Ser Phe Leu Ala His Leu Gin Trp Phe Ala Met 
3095 3100 3105 



Phe Ser Pro He Val Pro Phe Trp He Thr Ala He Tyr Val Phe 
3110 3115 3120 



Cys He Ser Leu Lys His Cys His Trp Phe Phe Asn Asn Tyr Leu 
3125 3130 3135 



Arg Lys Arg Val Met Phe Asn Gly Val Thr Phe Ser Thr Phe Glu 
3140 3145 3150 



Glu Ala Ala Leu Cys Thr Phe Leu Leu Asn Lys Glu Met Tyr Leu 
3155 3160 3165 



Lys Leu Arg Ser Glu Thr Leu Leu Pro Leu Thr Gin Tyr Asn Arg 
3170 3175 3180 
Tyr Leu Ala Leu Tyr Asn Lys Tyr Lys Tyr Phe Ser Gly Ala Leu
3185 3190 3195



Asp Thr Thr Ser Tyr Arg Glu Ala Ala Cys Cys His Leu Ala Lys
3200 3205 3210 



Ala Leu Asn Asp Phe Ser Asn Ser Gly Ala Asp Val Leu Tyr Gln
3215 3220 3225 



Pro Pro Gin Thr Ser He Thr Ser Ala Val Leu Gin Ser Gly Phe 
3230 3235 3240 



Arg Lys Met Ala Phe Pro Ser Gly Lys Val Glu Gly Cys Met Val 
3245 3250 3255 



Gin Val Thr Cys Gly Thr Thr Thr Leu Asn Gly Leu Trp Leu Asp 
3260 3265 3270 



Asp Thr Val Tyr Cys Pro Arg His Val He Cys Thr Ala Glu Asp 
3275 3280 3285 



Met Leu Asn Pro Asn Tyr Glu Asp Leu Leu Ile Arg Lys Ser Asn
3290 3295 3300



His Ser Phe Leu Val Gin Ala Gly Asn Val Gin Leu Arg Val He 
3305 3310 3315 



Gly His Ser Met Gin Asn Cys Leu Leu Arg Leu Lys Val Asp Thr 
3320 3325 3330 



Ser Asn Pro Lys Thr Pro Lys Tyr Lys Phe Val Arg He Gin Pro 
3335 3340 3345 



Gly Gin Thr Phe Ser Val Leu Ala Cys Tyr Asn Gly Ser Pro Ser 
3350 3355 3360



Gly Val Tyr Gin Cys Ala Met Arg Pro Asn His Thr He Lys Gly 
3365 3370 3375 



Ser Phe Leu Asn Gly Ser Cys Gly Ser Val Gly Phe Asn He Asp 
3380 3385 3390 



Tyr Asp Cys Val Ser Phe Cys Tyr Met His His Met Gin Leu Pro 
3395 3400 3405 
Thr Gly Val His Ala Gly Thr Asp Leu Glu Gly Lys Phe Tyr Gly
3410 3415 3420



Pro Phe Val Asp Arg Gln Thr Ala Gln Ala Ala Gly Thr Asp Thr
3425 3430 3435



Thr Ile Thr Leu Asn Val Leu Ala Trp Leu Tyr Ala Ala Val Ile
3440 3445 3450 



Asn Gly Asp Arg Trp Phe Leu Asn Arg Phe Thr Thr Thr Leu Asn 
' 3455 3460 3465 



Asp Phe Asn Leu Val Ala Met Lys Tyr Asn Tyr Glu Pro Leu Thr 
3470 3475 3480 



Gin Asp His Val Asp He Leu Gly Pro Leu Ser- Ala Gin Thr Gly 
3485 3490 ' 3495 



He Ala Val Leu Asp Met Cys Ala Ala Leu Lys Glu Leu Leu Gin 
3500 3505 3510 



Asn Gly Met Asn Gly Arg Thr He Leu Gly "Ser Thr He Leu Glu 
3515 3520 » 3525 



Asp Glu Phe Thr Pro Phe Asp Val Val Arg Gin Cys Ser Gly Val 
3530 3535 3540 ' 



Thr Phe Gin Gly Lys Phe Lys Lys He Val Lys Gly Thr His His 
354,5 3550 3555 



Trp Met Leu Leu Thr Phe Leu Thr Ser Leu Leu He Leu Val Gin 
3560 3565 3570 



Ser Thr Gin Trp Ser Leu Phe Phe Phe Val Tyr Glu Asn Ala Phe 
3575 3580 3585 



Leu Pro Phe Thr Leu Gly He ' Met Ala He Ala Ala Cys Ala Met 
3590 3595 3600 



Leu Leu Val Lys His Lys His Ala Phe Leu Cys Leu Phe Leu Leu 
3605 3610 3615 



Pro Ser Leu Ala Thr Val Ala Tyr Phe Asn Met Val Tyr Met Pro 
3620 3625 3630

Ala Ser Trp Val Met Arg Ile Met Thr Trp Leu Glu Leu Ala Asp
3635 3640 3645

Thr Ser Leu Ser Gly Tyr Arg Leu Lys Asp Cys Val Met Tyr Ala
3650 3655 3660

Ser Ala Leu Val Leu Leu Ile Leu Met Thr Ala Arg Thr Val Tyr
3665 3670 3675

Asp Asp Ala Ala Arg Arg Val Trp Thr Leu Met Asn Val Ile Thr
3680 3685 3690

Leu Val Tyr Lys Val Tyr Tyr Gly Asn Ala Leu Asp Gln Ala Ile
3695 3700 3705

Ser Met Trp Ala Leu Val Ile Ser Val Thr Ser Asn Tyr Ser Gly
3710 3715 3720

Val Val Thr Thr Ile Met Phe Leu Ala Arg Ala Ile Val Phe Val
3725 3730 3735

Cys Val Glu Tyr Tyr Pro Leu Leu Phe Ile Thr Gly Asn Thr Leu
3740 3745 3750

Gln Cys Ile Met Leu Val Tyr Cys Phe Leu Gly Tyr Cys Cys Cys
3755 3760 3765

Cys Tyr Phe Gly Leu Phe Cys Leu Leu Asn Arg Tyr Phe Arg Leu
3770 3775 3780

Thr Leu Gly Val Tyr Asp Tyr Leu Val Ser Thr Gln Glu Phe Arg
3785 3790 3795

Tyr Met Asn Ser Gln Gly Leu Leu Pro Pro Lys Ser Ser Ile Asp
3800 3805 3810

Ala Phe Lys Leu Asn Ile Lys Leu Leu Gly Ile Gly Gly Lys Pro
3815 3820 3825

Cys Ile Lys Val Ala Thr Val Gln Ser Lys Met Ser Asp Val Lys
3830 3835 3840

Cys Thr Ser Val Val Leu Leu Ser Val Leu Gln Gln Leu Arg Val
3845 3850 3855



Glu Ser Ser Ser Lys Leu Trp Ala Gln Cys Val Gln Leu His Asn
3860 3865 3870

Asp Ile Leu Leu Ala Lys Asp Thr Thr Glu Ala Phe Glu Lys Met
3875 3880 3885

Val Ser Leu Leu Ser Val Leu Leu Ser Met Gln Gly Ala Val Asp
3890 3895 3900

Ile Asn Arg Leu Cys Glu Glu Met Leu Asp Asn Arg Ala Thr Leu
3905 3910 3915

Gln Ala Ile Ala Ser Glu Phe Ser Ser Leu Pro Ser Tyr Ala Ala
3920 3925 3930

Tyr Ala Thr Ala Gln Glu Ala Tyr Glu Gln Ala Val Ala Asn Gly
3935 3940 3945

Asp Ser Glu Val Val Leu Lys Lys Leu Lys Lys Ser Leu Asn Val
3950 3955 3960

Ala Lys Ser Glu Phe Asp Arg Asp Ala Ala Met Gln Arg Lys Leu
3965 3970 3975

Glu Lys Met Ala Asp Gln Ala Met Thr Gln Met Tyr Lys Gln Ala
3980 3985 3990

Arg Ser Glu Asp Lys Arg Ala Lys Val Thr Ser Ala Met Gln Thr
3995 4000 4005

Met Leu Phe Thr Met Leu Arg Lys Leu Asp Asn Asp Ala Leu Asn
4010 4015 4020

Asn Ile Ile Asn Asn Ala Arg Asp Gly Cys Val Pro Leu Asn Ile
4025 4030 4035

Ile Pro Leu Thr Thr Ala Ala Lys Leu Met Val Val Val Pro Asp
4040 4045 4050

Tyr Gly Thr Tyr Lys Asn Thr Cys Asp Gly Asn Thr Phe Thr Tyr
4055 4060 4065

Ala Ser Ala Leu Trp Glu Ile Gln Gln Val Val Asp Ala Asp Ser
4070 4075 4080

Lys Ile Val Gln Leu Ser Glu Ile Asn Met Asp Asn Ser Pro Asn
4085 4090 4095

Leu Ala Trp Pro Leu Ile Val Thr Ala Leu Arg Ala Asn Ser Ala
4100 4105 4110

Val Lys Leu Gln Asn Asn Glu Leu Ser Pro Val Ala Leu Arg Gln
4115 4120 4125

Met Ser Cys Ala Ala Gly Thr Thr Gln Thr Ala Cys Thr Asp Asp
4130 4135 4140

Asn Ala Leu Ala Tyr Tyr Asn Asn Ser Lys Gly Gly Arg Phe Val
4145 4150 4155

Leu Ala Leu Leu Ser Asp His Gln Asp Leu Lys Trp Ala Arg Phe
4160 4165 4170

Pro Lys Ser Asp Gly Thr Gly Thr Ile Tyr Thr Glu Leu Glu Pro
4175 4180 4185

Pro Cys Arg Phe Val Thr Asp Thr Pro Lys Gly Pro Lys Val Lys
4190 4195 4200

Tyr Leu Tyr Phe Ile Lys Gly Leu Asn Asn Leu Asn Arg Gly Met
4205 4210 4215

Val Leu Gly Ser Leu Ala Ala Thr Val Arg Leu Gln Ala Gly Asn
4220 4225 4230

Ala Thr Glu Val Pro Ala Asn Ser Thr Val Leu Ser Phe Cys Ala
4235 4240 4245

Phe Ala Val Asp Pro Ala Lys Ala Tyr Lys Asp Tyr Leu Ala Ser
4250 4255 4260

Gly Gly Gln Pro Ile Thr Asn Cys Val Lys Met Leu Cys Thr His
4265 4270 4275

Thr Gly Thr Gly Gln Ala Ile Thr Val Thr Pro Glu Ala Asn Met
4280 4285 4290

Asp Gln Glu Ser Phe Gly Gly Ala Ser Cys Cys Leu Tyr Cys Arg
4295 4300 4305

Cys His Ile Asp His Pro Asn Pro Lys Gly Phe Cys Asp Leu Lys
4310 4315 4320

Gly Lys Tyr Val Gln Ile Pro Thr Thr Cys Ala Asn Asp Pro Val
4325 4330 4335

Gly Phe Thr Leu Arg Asn Thr Val Cys Thr Val Cys Gly Met Trp
4340 4345 4350

Lys Gly Tyr Gly Cys Ser Cys Asp Gln Leu Arg Glu Pro Leu Met
4355 4360 4365

Gln Ser Ala Asp Ala Ser Thr Phe
4370 4375

<210> 64
<211> 2697
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 64

Phe Lys Arg Val Cys Gly Val Ser Ala Ala Arg Leu Thr Pro Cys Gly
1 5 10 15

Thr Gly Thr Ser Thr Asp Val Val Tyr Arg Ala Phe Asp Ile Tyr Asn
20 25 30

Glu Lys Val Ala Gly Phe Ala Lys Phe Leu Lys Thr Asn Cys Cys Arg
35 40 45

Phe Gln Glu Lys Asp Glu Glu Gly Asn Leu Leu Asp Ser Tyr Phe Val
50 55 60

Val Lys Arg His Thr Met Ser Asn Tyr Gln His Glu Glu Thr Ile Tyr
65 70 75 80

Asn Leu Val Lys Asp Cys Pro Ala Val Ala Val His Asp Phe Phe Lys
85 90 95

Phe Arg Val Asp Gly Asp Met Val Pro His Ile Ser Arg Gln Arg Leu
100 105 110

Thr Lys Tyr Thr Met Ala Asp Leu Val Tyr Ala Leu Arg His Phe Asp
115 120 125

Glu Gly Asn Cys Asp Thr Leu Lys Glu Ile Leu Val Thr Tyr Asn Cys
130 135 140



Cys Asp Asp Asp Tyr Phe Asn Lys Lys Asp Trp Tyr Asp Phe Val Glu
145 150 155 160

Asn Pro Asp Ile Leu Arg Val Tyr Ala Asn Leu Gly Glu Arg Val Arg
165 170 175

Gln Ser Leu Leu Lys Thr Val Gln Phe Cys Asp Ala Met Arg Asp Ala
180 185 190

Gly Ile Val Gly Val Leu Thr Leu Asp Asn Gln Asp Leu Asn Gly Asn
195 200 205

Trp Tyr Asp Phe Gly Asp Phe Val Gln Val Ala Pro Gly Cys Gly Val
210 215 220

Pro Ile Val Asp Ser Tyr Tyr Ser Leu Leu Met Pro Ile Leu Thr Leu
225 230 235 240

Thr Arg Ala Leu Ala Ala Glu Ser His Met Asp Ala Asp Leu Ala Lys
245 250 255

Pro Leu Ile Lys Trp Asp Leu Leu Lys Tyr Asp Phe Thr Glu Glu Arg
260 265 270

Leu Cys Leu Phe Asp Arg Tyr Phe Lys Tyr Trp Asp Gln Thr Tyr His
275 280 285

Pro Asn Cys Ile Asn Cys Leu Asp Asp Arg Cys Ile Leu His Cys Ala
290 295 300

Asn Phe Asn Val Leu Phe Ser Thr Val Phe Pro Pro Thr Ser Phe Gly
305 310 315 320

Pro Leu Val Arg Lys Ile Phe Val Asp Gly Val Pro Phe Val Val Ser
325 330 335

Thr Gly Tyr His Phe Arg Glu Leu Gly Val Val His Asn Gln Asp Val
340 345 350

Asn Leu His Ser Ser Arg Leu Ser Phe Lys Glu Leu Leu Val Tyr Ala
355 360 365

Ala Asp Pro Ala Met His Ala Ala Ser Gly Asn Leu Leu Leu Asp Lys
370 375 380

Arg Thr Thr Cys Phe Ser Val Ala Ala Leu Thr Asn Asn Val Ala Phe
385 390 395 400




Gln Thr Val Lys Pro Gly Asn Phe Asn Lys Asp Phe Tyr Asp Phe Ala
405 410 415

Val Ser Lys Gly Phe Phe Lys Glu Gly Ser Ser Val Glu Leu Lys His
420 425 430

Phe Phe Phe Ala Gln Asp Gly Asn Ala Ala Ile Ser Asp Tyr Asp Tyr
435 440 445

Tyr Arg Tyr Asn Leu Pro Thr Met Cys Asp Ile Arg Gln Leu Leu Phe
450 455 460

Val Val Glu Val Val Asp Lys Tyr Phe Asp Cys Tyr Asp Gly Gly Cys
465 470 475 480

Ile Asn Ala Asn Gln Val Ile Val Asn Asn Leu Asp Lys Ser Ala Gly
485 490 495

Phe Pro Phe Asn Lys Trp Gly Lys Ala Arg Leu Tyr Tyr Asp Ser Met
500 505 510

Ser Tyr Glu Asp Gln Asp Ala Leu Phe Ala Tyr Thr Lys Arg Asn Val
515 520 525

Ile Pro Thr Ile Thr Gln Met Asn Leu Lys Tyr Ala Ile Ser Ala Lys
530 535 540

Asn Arg Ala Arg Thr Val Ala Gly Val Ser Ile Cys Ser Thr Met Thr
545 550 555 560

Asn Arg Gln Phe His Gln Lys Leu Leu Lys Ser Ile Ala Ala Thr Arg
565 570 575

Gly Ala Thr Val Val Ile Gly Thr Ser Lys Phe Tyr Gly Gly Trp His
580 585 590

Asn Met Leu Lys Thr Val Tyr Ser Asp Val Glu Thr Pro His Leu Met
595 600 605

Gly Trp Asp Tyr Pro Lys Cys Asp Arg Ala Met Pro Asn Met Leu Arg
610 615 620

Ile Met Ala Ser Leu Val Leu Ala Arg Lys His Asn Thr Cys Cys Asn
625 630 635 640

Leu Ser His Arg Phe Tyr Arg Leu Ala Asn Glu Cys Ala Gln Val Leu
645 650 655

Ser Glu Met Val Met Cys Gly Gly Ser Leu Tyr Val Lys Pro Gly Gly
660 665 670

Thr Ser Ser Gly Asp Ala Thr Thr Ala Tyr Ala Asn Ser Val Phe Asn
675 680 685

Ile Cys Gln Ala Val Thr Ala Asn Val Asn Ala Leu Leu Ser Thr Asp
690 695 700

Gly Asn Lys Ile Ala Asp Lys Tyr Val Arg Asn Leu Gln His Arg Leu
705 710 715 720

Tyr Glu Cys Leu Tyr Arg Asn Arg Asp Val Asp His Glu Phe Val Asp
725 730 735

Glu Phe Tyr Ala Tyr Leu Arg Lys His Phe Ser Met Met Ile Leu Ser
740 745 750

Asp Asp Ala Val Val Cys Tyr Asn Ser Asn Tyr Ala Ala Gln Gly Leu
755 760 765

Val Ala Ser Ile Lys Asn Phe Lys Ala Val Leu Tyr Tyr Gln Asn Asn
770 775 780

Val Phe Met Ser Glu Ala Lys Cys Trp Thr Glu Thr Asp Leu Thr Lys
785 790 795 800

Gly Pro His Glu Phe Cys Ser Gln His Thr Met Leu Val Lys Gln Gly
805 810 815

Asp Asp Tyr Val Tyr Leu Pro Tyr Pro Asp Pro Ser Arg Ile Leu Gly
820 825 830

Ala Gly Cys Phe Val Asp Asp Ile Val Lys Thr Asp Gly Thr Leu Met
835 840 845

Ile Glu Arg Phe Val Ser Leu Ala Ile Asp Ala Tyr Pro Leu Thr Lys
850 855 860

His Pro Asn Gln Glu Tyr Ala Asp Val Phe His Leu Tyr Leu Gln Tyr
865 870 875 880

Ile Arg Lys Leu His Asp Glu Leu Thr Gly His Met Leu Asp Met Tyr
885 890 895

Ser Val Met Leu Thr Asn Asp Asn Thr Ser Arg Tyr Trp Glu Pro Glu
900 905 910

Phe Tyr Glu Ala Met Tyr Thr Pro His Thr Val Leu Gln Ala Val Gly
915 920 925

Ala Cys Val Leu Cys Asn Ser Gln Thr Ser Leu Arg Cys Gly Ala Cys
930 935 940

Ile Arg Arg Pro Phe Leu Cys Cys Lys Cys Cys Tyr Asp His Val Ile
945 950 955 960

Ser Thr Ser His Lys Leu Val Leu Ser Val Asn Pro Tyr Val Cys Asn
965 970 975

Ala Pro Gly Cys Asp Val Thr Asp Val Thr Gln Leu Tyr Leu Gly Gly
980 985 990

Met Ser Tyr Tyr Cys Lys Ser His Lys Pro Pro Ile Ser Phe Pro Leu
995 1000 1005

Cys Ala Asn Gly Gln Val Phe Gly Leu Tyr Lys Asn Thr Cys Val
1010 1015 1020

Gly Ser Asp Asn Val Thr Asp Phe Asn Ala Ile Ala Thr Cys Asp
1025 1030 1035

Trp Thr Asn Ala Gly Asp Tyr Ile Leu Ala Asn Thr Cys Thr Glu
1040 1045 1050

Arg Leu Lys Leu Phe Ala Ala Glu Thr Leu Lys Ala Thr Glu Glu
1055 1060 1065

Thr Phe Lys Leu Ser Tyr Gly Ile Ala Thr Val Arg Glu Val Leu
1070 1075 1080

Ser Asp Arg Glu Leu His Leu Ser Trp Glu Val Gly Lys Pro Arg
1085 1090 1095

Pro Pro Leu Asn Arg Asn Tyr Val Phe Thr Gly Tyr Arg Val Thr
1100 1105 1110

Lys Asn Ser Lys Val Gln Ile Gly Glu Tyr Thr Phe Glu Lys Gly
1115 1120 1125

Asp Tyr Gly Asp Ala Val Val Tyr Arg Gly Thr Thr Thr Tyr Lys
1130 1135 1140

Leu Asn Val Gly Asp Tyr Phe Val Leu Thr Ser His Thr Val Met
1145 1150 1155

Pro Leu Ser Ala Pro Thr Leu Val Pro Gln Glu His Tyr Val Arg
1160 1165 1170

Ile Thr Gly Leu Tyr Pro Thr Leu Asn Ile Ser Asp Glu Phe Ser
1175 1180 1185

Ser Asn Val Ala Asn Tyr Gln Lys Val Gly Met Gln Lys Tyr Ser
1190 1195 1200

Thr Leu Gln Gly Pro Pro Gly Thr Gly Lys Ser His Phe Ala Ile
1205 1210 1215

Gly Leu Ala Leu Tyr Tyr Pro Ser Ala Arg Ile Val Tyr Thr Ala
1220 1225 1230

Cys Ser His Ala Ala Val Asp Ala Leu Cys Glu Lys Ala Leu Lys
1235 1240 1245

Tyr Leu Pro Ile Asp Lys Cys Ser Arg Ile Ile Pro Ala Arg Ala
1250 1255 1260

Arg Val Glu Cys Phe Asp Lys Phe Lys Val Asn Ser Thr Leu Glu
1265 1270 1275

Gln Tyr Val Phe Cys Thr Val Asn Ala Leu Pro Glu Thr Thr Ala
1280 1285 1290

Asp Ile Val Val Phe Asp Glu Ile Ser Met Ala Thr Asn Tyr Asp
1295 1300 1305

Leu Ser Val Val Asn Ala Arg Leu Arg Ala Lys His Tyr Val Tyr
1310 1315 1320

Ile Gly Asp Pro Ala Gln Leu Pro Ala Pro Arg Thr Leu Leu Thr
1325 1330 1335

Lys Gly Thr Leu Glu Pro Glu Tyr Phe Asn Ser Val Cys Arg Leu
1340 1345 1350

Met Lys Thr Ile Gly Pro Asp Met Phe Leu Gly Thr Cys Arg Arg
1355 1360 1365

Cys Pro Ala Glu Ile Val Asp Thr Val Ser Ala Leu Val Tyr Asp
1370 1375 1380

Asn Lys Leu Lys Ala His Lys Asp Lys Ser Ala Gln Cys Phe Lys
1385 1390 1395

Met Phe Tyr Lys Gly Val Ile Thr His Asp Val Ser Ser Ala Ile
1400 1405 1410

Asn Arg Pro Gln Ile Gly Val Val Arg Glu Phe Leu Thr Arg Asn
1415 1420 1425

Pro Ala Trp Arg Lys Ala Val Phe Ile Ser Pro Tyr Asn Ser Gln
1430 1435 1440

Asn Ala Val Ala Ser Lys Ile Leu Gly Leu Pro Thr Gln Thr Val
1445 1450 1455

Asp Ser Ser Gln Gly Ser Glu Tyr Asp Tyr Val Ile Phe Thr Gln
1460 1465 1470

Thr Thr Glu Thr Ala His Ser Cys Asn Val Asn Arg Phe Asn Val
1475 1480 1485

Ala Ile Thr Arg Ala Lys Ile Gly Ile Leu Cys Ile Met Ser Asp
1490 1495 1500

Arg Asp Leu Tyr Asp Lys Leu Gln Phe Thr Ser Leu Glu Ile Pro
1505 1510 1515

Arg Arg Asn Val Ala Thr Leu Gln Ala Glu Asn Val Thr Gly Leu
1520 1525 1530

Phe Lys Asp Cys Ser Lys Ile Ile Thr Gly Leu His Pro Thr Gln
1535 1540 1545

Ala Pro Thr His Leu Ser Val Asp Ile Lys Phe Lys Thr Glu Gly
1550 1555 1560

Leu Cys Val Asp Ile Pro Gly Ile Pro Lys Asp Met Thr Tyr Arg
1565 1570 1575

Arg Leu Ile Ser Met Met Gly Phe Lys Met Asn Tyr Gln Val Asn
1580 1585 1590

Gly Tyr Pro Asn Met Phe Ile Thr Arg Glu Glu Ala Ile Arg His
1595 1600 1605

Val Arg Ala Trp Ile Gly Phe Asp Val Glu Gly Cys His Ala Thr
1610 1615 1620

Arg Asp Ala Val Gly Thr Asn Leu Pro Leu Gln Leu Gly Phe Ser
1625 1630 1635

Thr Gly Val Asn Leu Val Ala Val Pro Thr Gly Tyr Val Asp Thr
1640 1645 1650

Glu Asn Asn Thr Glu Phe Thr Arg Val Asn Ala Lys Pro Pro Pro
1655 1660 1665

Gly Asp Gln Phe Lys His Leu Ile Pro Leu Met Tyr Lys Gly Leu
1670 1675 1680

Pro Trp Asn Val Val Arg Ile Lys Ile Val Gln Met Leu Ser Asp
1685 1690 1695

Thr Leu Lys Gly Leu Ser Asp Arg Val Val Phe Val Leu Trp Ala
1700 1705 1710

His Gly Phe Glu Leu Thr Ser Met Lys Tyr Phe Val Lys Ile Gly
1715 1720 1725

Pro Glu Arg Thr Cys Cys Leu Cys Asp Lys Arg Ala Thr Cys Phe
1730 1735 1740

Ser Thr Ser Ser Asp Thr Tyr Ala Cys Trp Asn His Ser Val Gly
1745 1750 1755

Phe Asp Tyr Val Tyr Asn Pro Phe Met Ile Asp Val Gln Gln Trp
1760 1765 1770

Gly Phe Thr Gly Asn Leu Gln Ser Asn His Asp Gln His Cys Gln
1775 1780 1785

Val His Gly Asn Ala His Val Ala Ser Cys Asp Ala Ile Met Thr
1790 1795 1800

Arg Cys Leu Ala Val His Glu Cys Phe Val Lys Arg Val Asp Trp
1805 1810 1815

Ser Val Glu Tyr Pro Ile Ile Gly Asp Glu Leu Arg Val Asn Ser
1820 1825 1830

Ala Cys Arg Lys Val Gln His Met Val Val Lys Ser Ala Leu Leu
1835 1840 1845

Ala Asp Lys Phe Pro Val Leu His Asp Ile Gly Asn Pro Lys Ala
1850 1855 1860

Ile Lys Cys Val Pro Gln Ala Glu Val Glu Trp Lys Phe Tyr Asp
1865 1870 1875

Ala Gln Pro Cys Ser Asp Lys Ala Tyr Lys Ile Glu Glu Leu Phe
1880 1885 1890

Tyr Ser Tyr Ala Thr His His Asp Lys Phe Thr Asp Gly Val Cys
1895 1900 1905

Leu Phe Trp Asn Cys Asn Val Asp Arg Tyr Pro Ala Asn Ala Ile
1910 1915 1920

Val Cys Arg Phe Asp Thr Arg Val Leu Ser Asn Leu Asn Leu Pro
1925 1930 1935

Gly Cys Asp Gly Gly Ser Leu Tyr Val Asn Lys His Ala Phe His
1940 1945 1950

Thr Pro Ala Phe Asp Lys Ser Ala Phe Thr Asn Leu Lys Gln Leu
1955 1960 1965

Pro Phe Phe Tyr Tyr Ser Asp Ser Pro Cys Glu Ser His Gly Lys
1970 1975 1980

Gln Val Val Ser Asp Ile Asp Tyr Val Pro Leu Lys Ser Ala Thr
1985 1990 1995

Cys Ile Thr Arg Cys Asn Leu Gly Gly Ala Val Cys Arg His His
2000 2005 2010

Ala Asn Glu Tyr Arg Gln Tyr Leu Asp Ala Tyr Asn Met Met Ile
2015 2020 2025

Ser Ala Gly Phe Ser Leu Trp Ile Tyr Lys Gln Phe Asp Thr Tyr
2030 2035 2040

Asn Leu Trp Asn Thr Phe Thr Arg Leu Gln Ser Leu Glu Asn Val
2045 2050 2055

Ala Tyr Asn Val Val Asn Lys Gly His Phe Asp Gly His Ala Gly
2060 2065 2070

Glu Ala Pro Val Ser Ile Ile Asn Asn Ala Val Tyr Thr Lys Val
2075 2080 2085

Asp Gly Ile Asp Val Glu Ile Phe Glu Asn Lys Thr Thr Leu Pro
2090 2095 2100

Val Asn Val Ala Phe Glu Leu Trp Ala Lys Arg Asn Ile Lys Pro
2105 2110 2115

Val Pro Glu Ile Lys Ile Leu Asn Asn Leu Gly Val Asp Ile Ala
2120 2125 2130

Ala Asn Thr Val Ile Trp Asp Tyr Lys Arg Glu Ala Pro Ala His
2135 2140 2145

Val Ser Thr Ile Gly Val Cys Thr Met Thr Asp Ile Ala Lys Lys
2150 2155 2160

Pro Thr Glu Ser Ala Cys Ser Ser Leu Thr Val Leu Phe Asp Gly
2165 2170 2175

Arg Val Glu Gly Gln Val Asp Leu Phe Arg Asn Ala Arg Asn Gly
2180 2185 2190

Val Leu Ile Thr Glu Gly Ser Val Lys Gly Leu Thr Pro Ser Lys
2195 2200 2205

Gly Pro Ala Gln Ala Ser Val Asn Gly Val Thr Leu Ile Gly Glu
2210 2215 2220

Ser Val Lys Thr Gln Phe Asn Tyr Phe Lys Lys Val Asp Gly Ile
2225 2230 2235

Ile Gln Gln Leu Pro Glu Thr Tyr Phe Thr Gln Ser Arg Asp Leu
2240 2245 2250

Glu Asp Phe Lys Pro Arg Ser Gln Met Glu Thr Asp Phe Leu Glu
2255 2260 2265

Leu Ala Met Asp Glu Phe Ile Gln Arg Tyr Lys Leu Glu Gly Tyr
2270 2275 2280

Ala Phe Glu His Ile Val Tyr Gly Asp Phe Ser His Gly Gln Leu
2285 2290 2295

Gly Gly Leu His Leu Met Ile Gly Leu Ala Lys Arg Ser Gln Asp
2300 2305 2310

Ser Pro Leu Lys Leu Glu Asp Phe Ile Pro Met Asp Ser Thr Val
2315 2320 2325

Lys Asn Tyr Phe Ile Thr Asp Ala Gln Thr Gly Ser Ser Lys Cys
2330 2335 2340

Val Cys Ser Val Ile Asp Leu Leu Leu Asp Asp Phe Val Glu Ile
2345 2350 2355

Ile Lys Ser Gln Asp Leu Ser Val Ile Ser Lys Val Val Lys Val
2360 2365 2370

Thr Ile Asp Tyr Ala Glu Ile Ser Phe Met Leu Trp Cys Lys Asp
2375 2380 2385

Gly His Val Glu Thr Phe Tyr Pro Lys Leu Gln Ala Ser Gln Ala
2390 2395 2400

Trp Gln Pro Gly Val Ala Met Pro Asn Leu Tyr Lys Met Gln Arg
2405 2410 2415

Met Leu Leu Glu Lys Cys Asp Leu Gln Asn Tyr Gly Glu Asn Ala
2420 2425 2430

Val Ile Pro Lys Gly Ile Met Met Asn Val Ala Lys Tyr Thr Gln
2435 2440 2445

Leu Cys Gln Tyr Leu Asn Thr Leu Thr Leu Ala Val Pro Tyr Asn
2450 2455 2460

Met Arg Val Ile His Phe Gly Ala Gly Ser Asp Lys Gly Val Ala
2465 2470 2475

Pro Gly Thr Ala Val Leu Arg Gln Trp Leu Pro Thr Gly Thr Leu
2480 2485 2490

Leu Val Asp Ser Asp Leu Asn Asp Phe Val Ser Asp Ala Asp Ser
2495 2500 2505

Thr Leu Ile Gly Asp Cys Ala Thr Val His Thr Ala Asn Lys Trp
2510 2515 2520

Asp Leu Ile Ile Ser Asp Met Tyr Asp Pro Arg Thr Lys His Val
2525 2530 2535

Thr Lys Glu Asn Asp Ser Lys Glu Gly Phe Phe Thr Tyr Leu Cys
2540 2545 2550

Gly Phe Ile Lys Gln Lys Leu Ala Leu Gly Gly Ser Ile Ala Val
2555 2560 2565

Lys Ile Thr Glu His Ser Trp Asn Ala Asp Leu Tyr Lys Leu Met
2570 2575 2580

Gly His Phe Ser Trp Trp Thr Ala Phe Val Thr Asn Val Asn Ala
2585 2590 2595

Ser Ser Ser Glu Ala Phe Leu Ile Gly Ala Asn Tyr Leu Gly Lys
2600 2605 2610

Pro Lys Glu Gln Ile Asp Gly Tyr Thr Met His Ala Asn Tyr Ile
2615 2620 2625

Phe Trp Arg Asn Thr Asn Pro Ile Gln Leu Ser Ser Tyr Ser Leu
2630 2635 2640

Phe Asp Met Ser Lys Phe Pro Leu Lys Leu Arg Gly Thr Ala Val
2645 2650 2655

Met Ser Leu Lys Glu Asn Gln Ile Asn Asp Met Ile Tyr Ser Leu
2660 2665 2670

Leu Glu Lys Gly Arg Leu Ile Ile Arg Glu Asn Asn Arg Val Val
2675 2680 2685

Val Ser Ser Asp Ile Leu Val Asn Asn
2690 2695

<210> 65
<211> 274
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 65

Met Asp Leu Phe Met Arg Phe Phe Thr Leu Arg Ser Ile Thr Ala Gln
1 5 10 15

Pro Val Lys Ile Asp Asn Ala Ser Pro Ala Ser Thr Val His Ala Thr
20 25 30

Ala Thr Ile Pro Leu Gln Ala Ser Leu Pro Phe Gly Trp Leu Val Ile
35 40 45

Gly Val Ala Phe Leu Ala Val Phe Gln Ser Ala Thr Lys Ile Ile Ala
50 55 60

Leu Asn Lys Arg Trp Gln Leu Ala Leu Tyr Lys Gly Phe Gln Phe Ile
65 70 75 80

Cys Asn Leu Leu Leu Leu Phe Val Thr Ile Tyr Ser His Leu Leu Leu
85 90 95

Val Ala Ala Gly Met Glu Ala Gln Phe Leu Tyr Leu Tyr Ala Leu Ile
100 105 110

Tyr Phe Leu Gln Cys Ile Asn Ala Cys Arg Ile Ile Met Arg Cys Trp
115 120 125

Leu Cys Trp Lys Cys Lys Ser Lys Asn Pro Leu Leu Tyr Asp Ala Asn
130 135 140

Tyr Phe Val Cys Trp His Thr His Asn Tyr Asp Tyr Cys Ile Pro Tyr
145 150 155 160

Asn Ser Val Thr Asp Thr Ile Val Val Thr Glu Gly Asp Gly Ile Ser
165 170 175

Thr Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly Gly Tyr Ser Glu Asp
180 185 190

Arg His Ser Gly Val Lys Asp Tyr Val Val Val His Gly Tyr Phe Thr
195 200 205

Glu Val Tyr Tyr Gln Leu Glu Ser Thr Gln Ile Thr Thr Asp Thr Gly
210 215 220

Ile Glu Asn Ala Thr Phe Phe Ile Phe Asn Lys Leu Val Lys Asp Pro
225 230 235 240

Pro Asn Val Gln Ile His Thr Ile Asp Gly Ser Ser Gly Val Ala Asn
245 250 255

Pro Ala Met Asp Pro Ile Tyr Asp Glu Pro Thr Thr Thr Thr Ser Val
260 265 270

Pro Leu

<210> 66
<211> 154
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 66

Met Met Pro Thr Thr Leu Phe Ala Gly Thr His Ile Thr Met Thr Thr
1 5 10 15

Val Tyr His Ile Thr Val Ser Gln Ile Gln Leu Ser Leu Leu Lys Val
20 25 30

Thr Ala Phe Gln His Gln Asn Ser Lys Lys Thr Thr Lys Leu Val Val
35 40 45

Ile Leu Arg Ile Gly Thr Gln Val Leu Lys Thr Met Ser Leu Tyr Met
50 55 60

Ala Ile Ser Pro Lys Phe Thr Thr Ser Leu Ser Leu His Lys Leu Leu
65 70 75 80

Gln Thr Leu Val Leu Lys Met Leu His Ser Ser Ser Leu Thr Ser Leu
85 90 95

Leu Lys Thr His Arg Met Cys Lys Tyr Thr Gln Ser Thr Ala Leu Gln
100 105 110

Glu Leu Leu Ile Gln Gln Trp Ile Gln Phe Met Met Ser Arg Arg Arg
115 120 125

Leu Leu Ala Cys Leu Cys Lys His Lys Lys Val Ser Thr Asn Leu Cys
130 135 140

Thr His Ser Phe Arg Lys Lys Gln Val Arg
145 150

<210> 67
<211> 63
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 67

Met Phe His Leu Val Asp Phe Gln Val Thr Ile Ala Glu Ile Leu Ile
1 5 10 15

Ile Ile Met Arg Thr Phe Arg Ile Ala Ile Trp Asn Leu Asp Val Ile
20 25 30

Ile Ser Ser Ile Val Arg Gln Leu Phe Lys Pro Leu Thr Lys Lys Asn
35 40 45

Tyr Ser Glu Leu Asp Asp Glu Glu Pro Met Glu Leu Asp Tyr Pro
50 55 60

<210> 68
<211> 122
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 68

Met Lys Ile Ile Leu Phe Leu Thr Leu Ile Val Phe Thr Ser Cys Glu
1 5 10 15

Leu Tyr His Tyr Gln Glu Cys Val Arg Gly Thr Thr Val Leu Leu Lys
20 25 30

Glu Pro Cys Pro Ser Gly Thr Tyr Glu Gly Asn Ser Pro Phe His Pro
35 40 45

Leu Ala Asp Asn Lys Phe Ala Leu Thr Cys Thr Ser Thr His Phe Ala
50 55 60

Phe Ala Cys Ala Asp Gly Thr Arg His Thr Tyr Gln Leu Arg Ala Arg
65 70 75 80

Ser Val Ser Pro Lys Leu Phe Ile Arg Gln Glu Glu Val Gln Gln Glu
85 90 95

Leu Tyr Ser Pro Leu Phe Leu Ile Val Ala Ala Leu Val Phe Leu Ile
100 105 110

Leu Cys Phe Thr Ile Lys Arg Lys Thr Glu
115 120

<210> 69
<211> 44
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 69

Met Asn Glu Leu Thr Leu Ile Asp Phe Tyr Leu Cys Phe Leu Ala Phe
1 5 10 15

Leu Leu Phe Leu Val Leu Ile Met Leu Ile Ile Phe Trp Phe Ser Leu
20 25 30

Glu Ile Gln Asp Leu Glu Glu Pro Cys Thr Lys Val
35 40

<210> 70
<211> 39
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 70

Met Lys Leu Leu Ile Val Leu Thr Cys Ile Ser Leu Cys Ser Cys Ile
1 5 10 15

Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro His Val Leu Glu
20 25 30

Asp Pro Cys Lys Val Gln His
35

<210> 71
<211> 84
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 71

Met Cys Leu Lys Ile
1 5 10 15

Ser Thr Ala Trp Leu Cys Ala Leu Gly Lys Val Leu Pro Phe His Arg
20 25 30

Trp His Thr Met Val Gln Thr Cys Thr Pro Asn Val Thr Ile Asn Cys
35 40 45

Gln Asp Pro Ala Gly Gly Ala Leu Ile Ala Arg Cys Trp Tyr Leu His
50 55 60

Glu Gly His Gln Thr Ala Ala Phe Arg Asp Val Leu Val Val Leu Asn
65 70 75 80

Lys Arg Thr Asn

<210> 72
<211> 98
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 72

Met Asp Pro Asn Gln Thr Asn Val Val Pro Pro Ala Leu His Leu Val
1 5 10 15

Asp Pro Gln Ile Gln Leu Thr Ile Thr Arg Met Glu Asp Ala Met Gly
20 25 30

Gln Gly Gln Asn Ser Ala Asp Pro Lys Val Tyr Pro Ile Ile Leu Arg
35 40 45

Leu Gly Ser Gln Leu Ser Leu Ser Met Ala Arg Arg Asn Leu Asp Ser
50 55 60

Leu Glu Ala Arg Ala Phe Gln Ser Thr Pro Ile Val Val Gln Met Thr
65 70 75 80

Lys Leu Ala Thr Thr Glu Glu Leu Pro Asp Glu Phe Val Val Val Thr
85 90 95

Ala Lys

<210> 73
<211> 70
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 73

Met Leu Pro Pro Cys Tyr Asn Phe Leu Lys Glu Gln His Cys Gln Lys
1 5 10 15

Ala Ser Thr Gln Arg Glu Ala Glu Ala Ala Val Lys Pro Leu Leu Ala
20 25 30

Pro His His Val Val Ala Val Ile Gln Glu Ile Gln Leu Leu Ala Ala
35 40 45

Val Gly Glu Ile Leu Leu Leu Glu Trp Leu Ala Glu Val Val Lys Leu
50 55 60

Pro Ser Arg Tyr Cys Cys
65 70

<210> 74
<211> 6
<212> RNA
<213> Coronavirus

<400> 74
cuaaac

<210> 75
<211> 13
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 75

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly
1 5 10

<210> 76
<211> 23
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 76

Thr Ile Pro Leu Gln Ala Ser Leu Pro Phe Gly Trp Leu Val Ile Gly
1 5 10 15

Val Ala Phe Leu Ala Val Phe
20

<210> 77
<211> 23
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 77

Phe Gln Phe Ile Cys Asn Leu Leu Leu Leu Phe Val Thr Ile Tyr Ser
1 5 10 15

His Leu Leu Leu Val Ala Ala
20

<210> 78
<211> 23
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 78

Ala Gln Phe Leu Tyr Leu Tyr Ala Leu Ile Tyr Phe Leu Gln Cys Ile
1 5 10 15

Asn Ala Cys Arg Ile Ile Met
20

<210> 79
<211> 18
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 79

Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala
1 5 10 15

Ile Leu

<210> 80
<211> 23
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 80

Leu Leu Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp
1 5 10 15

Ile Met Leu Leu Gln Phe Ala
20

<210> 81
<211> 23
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 81

Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe Val
1 5 10 15

Leu Ala Ala Val Tyr Arg Ile
20

<210> 82
<211> 23
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 82

Gly Gly Ile Ala Ile Ala Met Ala Cys Ile Val Gly Leu Met Trp Leu
1 5 10 15

Ser Tyr Phe Val Ala Ser Phe
20

<210> 83
<211> 20
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 83

His Leu Val Asp Phe Gln Val Thr Ile Ala Glu Ile Leu Ile Ile Ile
1 5 10 15

Met Arg Thr Phe
20

<210> 84
<211> 15
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 84

Met Lys Ile Ile Leu Phe Leu Thr Leu Ile Val Phe Thr Ser Cys
1 5 10 15

<210> 85
<211> 19
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 85

Ser Pro Leu Phe Leu Ile Val Ala Ala Leu Val Phe Leu Ile Leu Cys
1 5 10 15



Phe Thr Ile

<210> 86
<211> 83
<212> PRT
<213> Severe acute respiratory syndrome virus

<400> 86

Glu Leu Tyr His Tyr Gln Glu Cys Val Arg Gly Thr Thr Val Leu Leu
1 5 10 15

Lys Glu Pro Cys Pro Ser Gly Thr Tyr Glu Gly Asn Ser Pro Phe His
20 25 30

Pro Leu Ala Asp Asn Lys Phe Ala Leu Thr Cys Thr Ser Thr His Phe
35 40 45

Ala Phe Ala Cys Ala Asp Gly Thr Arg His Thr Tyr Gln Leu Arg Ala
50 55 60

Arg Ser Val Ser Pro Lys Leu Phe Ile Arg Gln Glu Glu Val Gln Gln
65 70 75 80

Glu Leu Tyr

<210> 87
<211> 37
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 87

caggaaacag ctatgacacc aagaacaagg ctctcca 37

<210> 88
<211> 37
<212> DNA
<213> Artificial Sequence

<220>
<223> Primer

<400> 88

caggaaacag ctatgacgat agggcctctt ccacaga 37

<210> 89
<211> 496
<212> DNA
<213> Severe acute respiratory syndrome virus

<220>
<221> misc_feature
<222> (11)..(11)
<223> n is a, c, g, or t

<400> 89

acctacccag ngaaaagcca accaacctcg atctcttgta gatctgttct ctaaacgaac 60
tttaaaatct gtgtagctgt cgctcggctg catgcctagt gcacctacgc agtataaaca 120
ataataaatt ttactgtcgt tgacaagaaa cgagtaactc gtccctcttc tgcagactgc 180
ttacggtttc gtccgtgttg cagtcgatca tcagcatacc taggtttcgt ccgggtgtga 240
ccgaaaggta agatggagag ccttgttctt ggtgtcaacg agaaaacaca cgtccaactc 300
agtttgcctg tccttcaggt tagagacgtg ctagtgcgtg gcttcgggga ctctgtggaa 360
gaggccctat cggaggcacg tgaacacctc aaaaatggca cttgtggtct agtagagctg 420
gaaaaaggcg tactgcccca gcttgaacag ccctatgtgt tcattaaacg ttctgatgcc 480
ttaagcacca atcacg 496

<210> 90
<211> 523
<212> DNA
<213> Severe acute respiratory syndrome virus

<400> 90

gtcgacaaca atttctgtgg cccagatggg taccctcttg attgcatcaa agattttctc 60
gcacgcgcgg gcaagtcaat gtgcactctt tccgaacaac ttgattacat cgagtcgaag 120
agaggtgtct actgctgccg tgaccatgag catgaaattg cctggttcac tgagcgctct 180
gataagagct acgagcacca gacacccttc gaaattaaga gtgccaagaa atttgacact 240
ttcaaagggg aatgcccaaa gtttgtgttt cctcttaact caaaagtcaa agtcattcaa 300
ccacgtgttg aaaagaaaaa gactgagggt ttcatggggc gtatacgctc tgtgtaccct 360
gttgcatctc cacaggagtg taacaatatg cacttgtcta ccttgatgaa atgtaatcat 420
tgcgatgaag tttcatggca gacgtgcgac tttctgaaag ccacttgtga acattgtggc 480
actgaaaatt tagttattga aggacctact acatgtgggt acc 523

<210> 91
<211> 324
<212> DNA
<213> Severe acute respiratory syndrome virus

<400> 91

cttaggtgac gagcttggca ctgatcccat tgaagattat gaacaaaact ggaacactaa 60
gcatggcagt ggtgcactcc gtgaactcac tcgtgagctc aatggaggtg cagtcactcg 120
ctatgtcgac aacaatttct gtggcccaga tgggtaccct cttgattgca tcaaagattt 180
tctcgcacgc gcgggcaagt caatgtgcac tctttccgaa caacttgatt acatcgagtc 240
gaagagaggt gtctactgct gccgtgacca tgagcatgaa attgcctggt tcactgagcg 300
ctcctgataa gagctacgag cacc 324






<210> 92

<211> 495

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 92

tgctataata agcgtgccta ctgggttcct cgtgctagtg ctgatattgg gctcaggcca 60
tactggcatt actggtgaca atgtggagac cttgaatgag gatctccttg agatactgag 120
tcgtgaacgt gttaacatta acattgttgg cgattttcat ttgaatgaag aggttgccat 180
cattttggca tctttctctg cttctacaag tgcctttatt gacactataa agagtcttga 240
ttacaagtct ttcaaaacca ttgttgagtc ctgcggtaac tataaagtta ccaagggaaa 300
gcccgtaaaa ggtgcttgga acattggaca acagagatca gttttaacac cactgtgtgg 360
ttttccctca caggctgctg gtgttatcag atcaattttt gcgcgcacac ttgatgcagc 420
aaaccactca attcctgatt tgcaaagagc agctgtcacc atacttgatg gtatttctga 480
acagtcatta cgtct 495

<210> 93

<211> 486

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 93

gccactcaaa cattgaaact cgactccgca agggaggtag gactagatgt tttggaggct 60

gtgtgtttgc ctatgttggc tgctataata agcgtgccta ctgggttcct cgtgctagtg 120

ctgatattgg ctcaggccat actggcatta ctggtgacaa tgtggagacc ttgaatgagg 180

atctccttga gatactgagt cgtgaacgtg ttaacattaa cattgttggc gattttcatt 240

tgaatgaaga ggttgccatc attttggcat ctttctctgc ttctacaagt gcctttattg 300

acactataaa gagtcttgat tacaagtctt tcaaaaccat tgttgagtcc tgcggtaact 360

ataaagttac caagggaaag cccgtaaaag gtgcttggaa cattggacaa cagagatcag 420

ttttaacacc actgtgtggt tttccctcac aggctgctgg tgttatcaga tcaatttttg 480
cgcgca 486

<210> 94 
<211> 567 

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 94

cactactgtg gaaaaactca ggcctatctt tgaatggatt gaggcgaaac ttagtgcagg 60
agttgaattt ctcaaggatg cttgggagat tctcaaattt ctcattacag gtgtttttga 120
catcgtcaag ggtcaaatac aggttgcttc agataacatc aaggattgtg taaaatgctt 180

cattgatgtt gttaacaagg cactcgaaat gtgcattgat caagtcacta tcgctggcgc 240

aaagttgcga tcactcaact taggtgaagt cttcatcgct caaagcaagg gactttaccg 300

tcagtgtata cgtggcaagg agcagctgca actactcatg cctcttaagg caccaaaaga 360

agtaaccttt cttgaaggtg attcacatga cacagtactt acctctgagg aggttgttct 420

caagaacggt gaactcgaag cactcgagac gcccgttgat agcttcacaa atggagctat 480

cgttggcaca ccagtctgtg taaatggcct catgctctta gagattaagg acaaagaaca 540

atactgcgca ttgtctcctg gtttact 567 

<210> 95

<211> 516
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 95

gggagattct caaatttctc attacaggtg tttttgacat cgtcaagggt caaatacagg 60
ttgcttcaga taacatcaag gattgtgtaa aatgcttcat tgatgttgtt aacaaggcac 120
tcgaaatgtg cattgatcaa gtcactatcg ctggcgcaaa gttgcgatca ctcaacttag 180
gtgaagtctt catcgctcaa agcaagggac tttaccgtca gtgtatacgt ggcaaggagc 240
agctgcaact actcatgcct cttaaggcac caaaagaagt aacctttctt gaaggtgatt 300
cacatgacac agtacttacc tctgaggagg ttgttctcaa gaacggtgaa ctcgaagcac 360
tcgagacgcc cgttgatagc ttcacaaatg gagctatcgt tggcacacca gtctgtgtaa 420
atggcctcat gctcttagag attaaggaca aagaacaata ctgcgcattg tctcctggtt 480
tactggctac aaacaatgtc tttcgcttaa aagggg 516

<210> 96

<211> 448
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 96

agttcgagtt gaggaagaag aagaggaaga ctggctggat gatactactg agcaatcaga 60
gattgagcca gaaccagaac ctacacctga agaaccagtt aatcagttta ctggttattt 120
aaaacttact gacaatgttg ccattaaatg tgttgacatc gttaaggagg cacaaagtgc 180
taatcctatg gtgattgtaa atgctgctaa catacacctg aaacatggtg gtggtgtagc 240
aggtgcactc aacaaggcaa ccaatggtgc catgcaaaag gagagtgatg attacattaa 300
gctaaatggc cctcttacag taggagggtc ttgtttgctt tctggacata atcttgctaa 360
gaagtgtctg catgttgttg gacctaacct aaatgcaggt gaggacatcc agcttcttaa 420
ggcagcatat gaaaatttca attcacag 448

<210> 97

<211> 333

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 97

agaggatgat tatcaaggtc tccctctgga atttggtgcc tcagctgaaa cagttcgagt 60
tgaggaagaa gaagaggaag actggctgga tgatactact gagcaatcag agattgagcc 120
agaaccagaa cctacacctg aagaaccagt taatcagttt actggttatt taaaacttac 180
tgacaatgtt gccattaaat gtgttgacat cgttaaggag gcacaaagtg ctaatcctat 240
ggtgattgta aatgctgcta acatacacct gaaacatggt ggtggtgtag caggtgcact 300
caacaaggca accaatggtg ccatgcaaaa gga 333

<210> 98

<211> 399
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 98

gagatgctct caagagcttt gaagaaagtg ccagttgatg agtatataac cacgtaccct 60
ggacaaggat gtgctggtta tacacttgag gaagctaaga ctgctcttaa gaaatgcaaa 120
tctgcatttt atgtactacc ttcagaagca cctaatgcta aggaagagat tctaggaact 180
gtatcctgga atttgagaga aatgcttgct catgctgaag agacaagaaa attaatgcct 240
atatgcatgg atgttagagc cataatggca accatccaac gtaagtataa aggaattaaa 300
attcaagagg gcatcgttga ctatggtgtc cgattcttct tttatactag taaagagcct 360
gtagcttcta ttattacgaa gctgaactct ctaaatgag 399

<210> 99 
<211> 437 
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 99

agaaatctgt cgtacagaag cctgtcgatg tgaagccaaa aattaaggcc tgcattgatg 60
aggttaccac aacactggaa gaaactaagt ttcttaccaa taagttactc ttgtttgctg 120
atatcaatgg taagctttac catgattctc agaacatgct tagaggtgaa gatatgtctt 180
tccttgagaa ggatgcacct tacatggtag gtgatgttat cactagtggt gatatcactt 240
gtgttgtaat accctccaaa aaggctggtg gcactactga gatgctctca agagctttga 300
agaaagtgcc agttgatgag tatataacca cgtaccctgg acaaggatgt gctggttata 360
cacttgagga agctaagact gctcttaaga aatgcaaatc tgcattttat gtactacctt 420
cagaagcacc taatgct 437 

<210> 100 
<211> 569 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 100 

cctctatcgt attgacggag ctcaccttac aaagatgtca gagtacaaag gaccagtgac 60
tgatgttttc tacaaggaaa catcttacac tacaaccatc aagcctgtgt cgtataaact 120 
cgatggagtt acttacacag agattgaacc aaaattggat gggtattata aaaaggataa 180 
tgcttactat acagagcagc ctatagacct tgtaccaact caaccattac caaatgcgag 240 
ttttgataat ttcaaactca catgttctaa cacaaaattt gctgatgatt taaatcaaat 300 
gacaggcttc acaaagccag cttcacgaga gctatctgtc acattcttcc cagacttgaa 360 
tggcgatgta gtggctattg actatagaca ctattcagcg agtttcaaga aaggtgctaa 420 
attactgcat aagccaattg tttggcacat taaccaggct acaaccaaga caacgttcaa 480 
accaaacact tggtgtttac gttgtctttg gagtacaaag ccagtagata cttcaaattc 540 

atttgaagtt ctggcagtag aagacacat 569

<210> 101 
<211> 187 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 101 

tcagcagata cttcaaattc atttgaagtt ctggcagtag aagacacaca aggaatggac 60 
aatcttgctt gtgaaagtca acaacccacc tctgaagaag tagtggaaaa tcctaccata 120 
cagaaggaag tcatagagcg tgacgtgaaa actaccgaag ttgtaggcaa tgtcatactt 180 
aaaccat 187 

<210> 102
<211> 271 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 102 

aaatgcgacg agtctgcttc taagtctgct tctgtgtact acagtcagct gatgtgccaa 60 
cctattctgt tgcttgacca agctcttgta tcagacgttg gagatagtac tgaagtttcc 120 
gttaagatgt ttgatgctta tgtcgacacc ttttcagcaa cttttagtgt tcctatggaa 180 
aaacttaagg cacttgttgc tacagctcac agcgagttag caaagggtgt agctttagat 240
ggtgtccttt ctacattcgt gtcagctgcc c 271

<210> 103

<211> 363

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 103

catttcatca gcaattcttg gctcatgtgg tttatcatta gtattgtaca aatggcaccc 60
gtttctgcaa tggttaggat gtacatcttc tttgcttctt tctactacat atggaagagc 120
tatgttcata tcatggatgg ttgcacctct tcgacttgca tgatgtgcta taagcgcaat 180
cgtgccacac gcgttgagtg tacaactatt gttaatggca tgaagagatc tttctatgtc 240
tatgcaaatg gaggccgtgg cttctgcaag actcacaatt ggaattgtct caattgtgac 300
acattttgca ctggtagtac attcattagt gatgaagttg ctcgagattt gtcactccag 360

ttt 363

<210> 104
<211> 500
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 104

agag^tcttg gcgcatgtat tgactgtaat gcaaggcata tcaatgccca aggtagcaaa 60

aagtcacaat gtttcactca tctggaatgt aaaagactac atgtctttat ctgaacagct 120

gcgtaaacaa attcgtagtg ctgccaagaa gaacaacata ccttttagac taacttgtgc 180

tacaactaga caggttgtca atgtcataac tactaaaatc tcactcaagg gtggtaagat 240 

tgttagtact tgttttaaac ttatgcttaa ggccacatta ttgtgcgttc ttgctgcatt 300 

ggtttgttat atcgttatgc cagtacatac attgtcaatc catgatggtt acacaaatga 360 

aatcattggt tacaaagcca ttcaggatgg tgtcactcgt gacatcattt ctactgatga 420 

ttgttttgca aataaacatg ctggttttga cgcatggttt agccagcgtg gtggttcata 480 
caaaaatgac aaaagctgcc 500 

<210> 105 
<211> 537 
<212> DNA 

<213> Severe acute respiratory syndrome virus
<400> 105 

cattgtcaat ccatgatggt tacacaaatg aaatcattgg ttacaaagcc attcaggatg 60 
gtgtcactcg tgacatcatt tctactgatg attgttttgc aaataaacat gctggttttg 120 
acgcatggtt tagccagcgt ggtggttcat acaaaaatga caaaagctgc cctgtagtag 180 
ctgctatcat tacaagagag attggtttca tagtgcctgg cttacdgggt actgtgctga 240

gagcaatcaa tggtgacttc ttgcattttc tacctcgtgt ttttagtgct gttggcaaca 300

tttgctacac accttccaaa ctcattgagt atagtgattt tgctacctct gcttgcgttc 360

ttgctgctga gtgtacaatt tttaaggatg ctatgggcaa acctgtgcca tattgttatg 420

acactaattt gctagagggt tctatttctt atagtgagct tcgtccagac actcgttatg 480 

tgcttatgga tggttccatc atacagtttc ctaacactta cctggagggg tctgtta 537 

<210> 106 
<211> 427 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 106 

cacttttgtt tttgatgtct ttcactatac tctgtctggt accagcttac agctttctgc 60 

cgggagtcta ctcagtcttt tacttgtact tgacattcta tttcaccaat gatgtttcat 120

tcttggctca ccttcaatgg tttgccatgt tttctcctat tgtgcctttt tggataacag 180

caatctatgt attctgtatt tctctgaagc actgccattg gttctttaac aactatctta 240

ggaaaagagt catgtttaat ggagttacat ttagtacctt cgaggaggct gctttgtgta 300

cctttttgct caacaaggaa atgtacctaa aattgcgtag cgagacactg ttgccactta 360

cacagtataa caggtatctt gctctatata acaagtacaa gtatttcagt ggagccttag 420

atactac 427

<210> 107 
<211> 537 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 107 

agtaacaact tttgatgctg agtactgtag acatggtaca tgcgaaaggt cagaagtagg 60

tatttgccta tctaccagtg gtagatgggt tcttaataat gagcattaca gagctctatc 120 

aggagttttc tgtggtgttg atgcgatgaa tctcatagct aacatcttta ctcctcttgt 180 

gcaacctgtg ggtgctttag atgtgtctgc ttcagtagtg gctggtggta ttattgccat 240 

attggtgact tgtgctgcct actactttat gaaattcaga cgtgtttttg gtgagtacaa 300

ccatgttgtt gctgctaatg cacttttgtt tttgatgtct ttcactatac tctgtctggt 360 

accagcttac agctttctgc cgggagtcta ctcagtcttt tacttgtact tgacattcta 420 

tttcaccaat gatgtttcat tcttggctca ccttcaatgg tttgccatgt tttctcctat 480 

tgtgcctttt tggataacag caatctatgt attctgtatt tctctgaagc actgcca 537 
<210> 108

<211> 551

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 108

agtatactgt ccaagacatg tcatttgcac agcagaagac atgcttaatc ctaactatga 60
agatctgctc attcgcaaat ccaaccatag ctttcttgtt caggctggca atgttcaact 120
tcgtgttatt ggccattcta tgcaaaattg tctgcttagg cttaaagttg atacttctaa 180
ccctaagaca cccaagtata aatttgtccg tatccaacct ggtcaaacat tttcagttct 240
agcatgctac aatggttcac catctggtgt ttatcagtgt gccatgagac ctaatcatac 300
cattaaaggt tctttcctta atggatcatg tggtagtgtt ggttttaaca ttgattatga 360
ttgcgtgtct ttctgctata tgcatcatat ggagcttcca acaggagtac acgctggtac 420
tgacttagaa ggtaaattct atggtccatt tgttgacaga caaactgcac aggctgcagg 480
tacagacaca accataacat taaatgtttt ggcatggctg tatgctgctg ttatcaatgg 540
tgataggtgg t 551


<210> 109
<211> 593
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 109

acttagcaaa ggctctaaat gactttagca actcaggtgc tgatgttctc taccaaccac 60
cacagacatc aatcacttct gctgttctgc agagtggttt taggaaaatg gcattcccgt 120
caggcaaagt tgaagggtgc atggtacaag taacctgtgg aactacaact cttaatggat 180
tgtggttgga tgacacagta tactgtccaa gacatgtcat ttgcacagca gaagacatgc 240
ttaatcctaa ctatgaagat ctgctcattc gcaaatccaa ccatagcttt cttgttcagg 300
ctggcaatgt tcaacttcgt gttattggcc attctatgca aaattgtctg cttaggctta 360
aagttgatac ttctaaccct aagacaccca agtataaatt tgtccgtatc caacctggtc 420
aaacattttc agttctagca tgctacaatg gttcaccatc tggtgtttat cagtgtgcca 480
tgagacctaa tcataccatt aaaggttctt tccttaatgg atcatgtggt agtgttggtt 540
ttaacattga ttatgattgc gtgtctttct gctatatgca tcatatggag ctt 593



<210> 110 

<211> 504 

<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 110 
tgtgctgctt tgaaagagct gctgcagaat gggtatgaat ggtcgtacta tccttggtag 60
cactatttta gaagatgagt ttacaccatt tgatgttgtt agacaatgct ctggtgttac 120
cttccaaggg taagttcaag aaaattgtta agggcactca tcattggatg cttttaactt 180
tcttgacatc actattgatt cttgttcaaa gtacacagtg gtcactgttt ttctttgttt 240
acgagaatgc tttcttgcca tttactcttg gtattatggc aattgctgca tgtgctatgc 300
tgcttgttaa gcataagcac gcattcttgt gcttgtttct gttaccttct cttgcaacag 360
ttgcttactt taatatggtc tacatgcctg ctagctgggt gatgcgtatc atgacatggc 420
ttgaattggc tgacactagc ttgtctggtt ataggcttaa ggattgtgtt atgtatgctt 480
cagctttagt tttgcttatt ctca 504


<210> 111
<211> 298

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 111

taggcttaag gattgtgtta tgtatgcttc agctttagtt ttgcttattc tcatgacagc 60
tcgcactgtt tatgatgatg ctgctagacg tgtttggaca ctgatgaatg tcattacact 120
tgtttacaaa gtctactatg gtaatgcttt agatcaagct atttccatgt gggccttagt 180
tatttctgta acctctaeLct attctggtgt cgttacgact atcatgtttt tagctagagc 240
tatagtgttt gtgtgtgttg agtattaccc attgttattt attacctggc aacacctt 298


<210> 112 
<211> 530 

<212> DNA 

<213> Severe acute respiratory syndrome virus 






<400> 112 
aaacaggcaa gatctgagga caagagggca aaagtaacta gtgctatgca aacaatgctc 60
ttcactatgc ttaggaagct tgataatgat gcacttaaca acattatcaa caatgcgcgt 120
gatggttgtg ttccactcaa catcatacca ttgactacag cagccaaact catggttgtt 180
gtccctgatt atggtaccta caagaacact tgtgatggta acacctttac atatgcatct 240
gcactctggg aaatccagca agttgttgat gcggatagca agattgttca acttagtgaa 300
attaacatgg acaattcacc aaatttggct tggcctctta ttgttacagc tctaagagcc 360
aactcagctg ttaaactaca gaataatgaa ctgagtccag tagcactacg acagatgtcc 420
tgtgcggctg gtaccacaca aacagcttgt actgatgaca atgcacttgc ctactataac 480
aattcgaagg gaggtaggtt tgtgctggca ttactatcag accaccaagc 530
<210> 113

<211> 605

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 113

gaagtcgttc tcaaaaagtt aaagaaatct ttgaatgtgg ctaaatctga gtttgaccgt 60
gatgctgcca tgcaacgcaa gttggaaaag atggcagatc aggctatgac ccaaatgtac 120
aaacaggcaa gatctgagga caagagggca aaagtaacta gtgctatgca aacaatgctc 180
ttcactatgc ttaggaagct tgataatgat gcacttaaca acattatcaa caatgcgcgt 240
gatggttgtg ttccactcaa catcatacca ttgactacag cagccaaact catggttgtt 300
gtccctgatt atggtaccta caagaacact tgtgatggta acacctttac atatgcatct 360
gcactctggg aaatccagca agttgttgat gcggatagca agattgttca acttagtgaa 420
attaacatgg acaattcacc aaatttggct tggcctctta ttgttacagc tctaagagcc 480
aactcagctg ttaaactaca gaataatgaa ctgagtccag tagcactacg acagatgtcc 540
tgtgcggctg gtaccacaca aacagcttgt actgatgaca atgcacttgc ctactataac 600
aattc 605
<210> 114 

<211> 176

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 114

acactggtac aggacaggca attactgtaa caccagaagc taacatggac caagagtcct 60
ttggtggtgc ttcatgttgt ctgtattgta gatgccacat tgaccatcca aatcctaaag 120
gattctgtga cttgaaaggt aagtacgtcc aaatacctac cacttgtgct aatgat 176 

<210> 115 

<211> 516 
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 115

actgtaacac cagaagctaa catggaccaa gagtcctttg gtggtgcttc atgttgtctg 60
tattgtagat gccacattga ccatccaaat cctaaaggat tctgtgactt gaaaggtaag 120
tacgtccaaa tacctaccac ttgtgctaat gacccagtgg gttttacact tagaaacaca 180
gtctgtaccg tctgcggaat gtggaaaggt tatggctgta gttgtgacca actccgcgaa 240
cccttgatgc agtctgcgga tgcatcaacg tttttaaacg ggtttgcggt gtaagtgcag 300
cccgtcttac accgtgcggc acaggcacta gtactgatgt cgtctacagg gcttttgata 360
tttacaacga aaaagttgct ggttttgcaa agttcctaaa aactaattgc tgtcgcttcc 420
aggagaagga tgaggaaggc aatttattag actcttactt tgtagttaag aggcatacta 480
tgtctaccta ccaacatgaa gagactattt ataact 516 

<210> 116 
<211> 366

<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 116 

accacttatt aagtgggatt tgctgaaata tgattttacg gaagagagac tttgtctctt 60 
cgaccgttat tttaaatatt gggaccagac ataccatccc aattgtatta actgtttgga 120 
tgataggtgt atccttcatt gtgcaaactg taatgtgtta ttttctgctg tgtttccacg 180 
tacaagtttt ggaccactag taagaaaaat atttgtagat ggtgttcctt ttgttgtttc 240
aactggatac cattttcgtg agttaggagt cgtacataat caggatgtaa acttacatag 300
ctcgcgtctc agtttcaagg aacttttagt gtatgctgct gatccagcta tgcatgcagc 360

ttctgg 366

<210> 117 

<211> 291

<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 117

tgaaaaagtt gctggttttg caaagttcct aaaaactaat tgctgtcgct tccaggagaa 60
ggatgaggaa ggcaatttat tagactctta ctttgtagtt aagaggcata ctatgtctaa 120
ctaccaacat gaagagacta tttataactt ggttaaagat tgtccagcgg ttgctgtcca 180
tgactttttc aagtttagag tagatggtga catggtacca catatatcac gtcagcgtct 240
aactaaatac acaatggctg atttagtcta tgctctacgt cattttgatg a 291

<210> 118 

<211> 480 

<212> DNA

<213> Severe acute respiratory syndrome virus 

<400> 118 

gagtcccata tggatgctga tctcgcaaaa ccacttatta agtgggattt gctgaaatat 60 

gattttacgg aagagagact ttgtctcttc gaccgttatt ttaaatattg ggaccagaca 120 

taccatccca attgtattaa ctgtttggat gataggtgta tccttcattg tgcaaacttt 180 

aatgtgttat tttctactgt gtttccacct acaagttttg gaccactagt aagaaaaata 240 

tttgtagatg gtgttccttt tgttgtttca actggatacc attttcgtga gttaggagtc 300 

gtacataatc aggatgtaaa cttacatagc tcgcgtctca gtttcaagga acttttagtg 360 
tatgctgctg atccagctat gcatgcagct tctggcaatt tattgct^ga taaacgcact 420
acatgctttt cagtagctgc actaacaaac aatgttgctt ttcaaactgt caaacccggt 480 

<210> 119

<211> 405

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 119

aatgggaact ggtacgattt cggtgatttc gtacaagtag caccaggctg cggagttcct 60
attgtggatt catattactc attgctgatg cccatcctca ctttgactag ggcattggct 120
gctgagtccc atatggatgc tgatctcgca aaaccactta ttaagtgaga tttgctgaaa 180
tatgatttta cggaagagag actttgtctc ttcgaccgtt attttaaata ttgggaccag 240
acataccatc ccaattgtat taactgtttg gatgataggt gtatccttca ttgtgcaaac 300
tttaatgtgt tattttctac tgtgtttcca cctacaagct ttggaccact agtaagaaaa 360
atatttgtag atggtgttcc ttttgttgtt tcaactggat accat 405

<210> 120 

<211> 562 

<212> DNA

<213> Severe acute respiratory syndrome virus

<220>

<221> misc_feature

<222> (67)..(67)

<223> n is a, c, g, or t

<400> 120

ctattgatgc ttacccactt acaaaacatc ctaatcagga gtatgctgat gtctttcact 60
tgtattnaca atacattaga aagttacatg atgagcttac tggccacatg ttggacatgt 120
attccgtaat gctaactaat gataacacct cacggtactg ggaacctgag ttttatgagg 180
ctatgtacac accacataca gtcttgcagg ctgtaggtgc ttgtgtattg tgcaattcac 240
agacttcact tcgttgcggt gcctgtatta ggagaccatt cctatgttgc aagtgctgct 300
atgaccatgt catttcaaca tcacacaaat tagtgttgtc tgttaatccc tatgtttgca 360
atgccccagg ttgtgatgtc actgatgtga cacaactgta tctaggaggt atgagctatt 420
attgcaagtc acataagcct cccattagtt ttccattatg tgctaatggt caggtttttg 480
gtttatacaa aaacacatgt gtaggcagtg acaatgtcac tgacttcaat gcgatagcaa 540
catgtgattg gactaatgct gg 562



<210> 121 
<211> 580

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 121

gctatgtaca caccacatac agtcttgcag gctgtaggtg cttgtgtatt gtgcaattca 60
cagacttcac ttcgttgcgg tgcctgtatt aggagaccat tcctatgttg caagtgctgc 120
tatgaccatg tcatttcaac atcacacaaa ttagtgttgt ctgttaatcc ctatgtttgc 180
aatgccccag gttgtgatgt cactgatgtg acacaactgt atctaggagg tatgagctat 240
tattgcaagt cacataagcc tcccattagt tttccattat gtgctaatgg tcaggttttt 300
ggtttataca aaaacacatg tgtaggcagt gacaatgtca ctgacttcaa tgcgatagca 360
acatgtgatt ggactaatgc tggcgattac atacttgcca acacttgtac tgagagactc 420
aagcttttcg cagcagaaac gctcaaagcc actgaggaaa catttaagct gtcatatggt 480
attgccactg tacgcgaagt actctctgac agagaattgc atctttcatg ggaggttgga 540
aaacctagac c^Gcattgaa cagaaactat gtctttactg 580


<210> 122 
<211> 610 

<212> DNA
<213> Severe acute respiratory syndrome virus

<400> 122

tggtgatgct gttgtgtaca gaggtactac gacatacaag ttgaatgttg gtgattactt 60
tgtgttgaca tctcacactg taatgccact tagtgcacct actctagtgc cacaagagca 120
ctatgtgaga attactggct tgtacccaac actcaacatc tcagatgagt tttctagcaa 180
tgttgcaaat tatcaaaagg tcggcatgca aaagtactct acactccaag gaccacctgg 240
tactggtaag agtcattttg ccatcggact tgctctctat tacccatctg ctcgcatagt 300
gtatacggca tgctctcatg cagctgttga tgccctatgt gaaaaggcat taaaatattt 360
gcccatagat aaatgtagta gaatcatacc tgcgcgtgcg cgcgtagagt gttttgataa 420
attcaaagtg aattcaacac tagaacagta tgttttctgc actgtaaatg cattgccaga 480
aacaactgct gacattgtag tctttgatga aatctctatg gctactaatt atgacttgag 540
tgttgtcaat gctagacttc gtgcaaaaca ctacgtctat attggcgatc ctgctcaatt 600
accagcccct 610



<210> 123 

<211> 429 

<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 123
ccaacactca acatctcaga tgagttttct agcaatgttg caaattatca aaaggtcggc 60

atgcaaaagt actctacact ccaaggacca cctggtactg gtaagagtca ttttgccatc 120 

ggacttgctc tctattaccc atctgctcgc atagtgtata cggcatgctc tcatgcagct 180 

gttgatgccc tatgtgaaaa ggcattaaaa tatttgccca tagataaatg tagtagaatc 240 

atacctgcgc gtgcgcgcgt agagtgtttt gataaattca aagtgaattc aacactagaa 300 

cagtatgttt tctgcactgt aaatgcattg ccagaaacaa ctgctgacat tgtagtcttt 360 

gatgaaatct ctatggctac taattatgac ttgagtgttg tcaatgctag acttcgtgca 420 
aaacactac 429

<210> 124

<211> 486
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 124

caatgtggct atcacaaggg caaaaattgg cattttgtgc ataatgtctg atagagatct 60
ttatgacaaa ctgcaattta caagtctaga aataccacgt cgcaatgtgg ctacattaca 120
agcagaaaat gtaactggac tttttaagga ctgtagtaag atcattactg gtcttcatcc 180
tacacaggca cctacacacc tcagcgttga tataaagttc aagactgaag gattatgtgt 240
tgacatacca ggcataccaa aggacatgac ctaccgtaga ctcatctcta tgatgggttt 300
caaaatgaat taccaagtca atggttaccc taatatgttt atcacccgcg aagaagctat 360
tcgtcacgtt cgtgcgtgga ttggctttga tgtagagggc tgtcatgcaa ctagagatgc 420
tgtgggtact aacctacctc tccagctagg attttctaca ggtgttaact tagtagctgt 480
accgac 486

<210> 125
<211> 427
<212> DNA

<213> Severe acute respiratory syndrome virus
<400> 125

aaaggacatg acctaccgta gactcatctc tatgatgggt ttcaaaatga attaccaagt 60
caatggttac cctaatatgt ttatcacccg cgaagaagct attcgtcacg ttcgtgcgtg 120
gattggcttt gatgtagagg gctgtcatgc aactagagat gctgtgggta ctaacctacc 180
tctccagcta ggattttcta caggtgttaa cttagtagct gtaccgactg gttatgttga 240
cactgaaaat aacacagaat tcaccagagt taatgcaaaa cctccaccag gtgaccagtt 300
taaacatctt ataccactca tgtataaagg cttgccctgg aatgtagtgc gtattaagat 360
agtacaaatg ctcagtgata cactgaaagg attgtcagac agagtcgtgt tcgtcctttg 420
ggcgcat 427

<210> 126 
<211> 392
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 126

atggaaatgc acatgtggct agttgtgatg ctatcatgac tagatgttta gcagtccatg 60
agtgctttgt taagcgcgtt gattggtctg ttgaataccc tattatagga gatgaactga 120
gggttaattc tgcttgcaga aaagtacaac acatggttgt gaagtctgca ttgcttgctg 180
ataagtttcc agttcttcat gacattggaa atccaaaggc tatcaagtgt gtgcctcagg 240
ctgaagtaga atggaagttc tacgatgctc agccatgtag tgacaaagct tacaaaatag 300
aggaactctt ctattcttat gctacacatc acgataaatt cactgatggt gtttgtttgt 360
tttggaattg t^acgttgat cgttacccag CG 392


<210> 127
<211> 483
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 127
gcttcatcag 




cattctgtgg 


gttttgacta tgtctataac 


60

ccatttatga ttgatgttca gcagtggggc tttacgggta accttcagag taaccatgac 120
caacattgcc aggtacatgg aaatgcacat gtggctagtt gtgatgctat catgactaga 180
tgtttagcag tccatgagtg ctttgttaag cgcgttgatt ggtctgttga ataccctatt 240
ataggagatg aactgagggt taattctgct tgcagaaaag tacaacacat ggttgtgaag 300
tctgcattgc ttgctgataa gtttccagtt cttcatgaca ttggaaatcc aaaggctatc 360
aagtgtgtgc ctcaggctga agtagaatgg aagttctacg atgctcagcc atgtagtgac 420
aaagcttaca aaatagagga actcttctat tcttatgcta cacatcacga taaattcact 480
gat 483

<210> 128
<211> 326
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 128

tcaaagggac cagcacaagc tagcgtcaat ggagtcacat taattggaga atcagtaaaa 60
acacagttta actactttaa gaaagtagac ggcattattc aacagttgcc tgaaacctac 120
tttactcaga gcagagactt agaggatttt aagcccagat cacaaatgga aactgacttt 180
ctcgagctcg ctatggatga attcatacag cgatataagc tcgagggcta tgccttcgaa 240
cacatcgttt atggagattt cagtcatgga caacttggcg gtcttcattt aatgataggc 300
ttagccaagc gctcacaaga ttcact 326

<210> 129

<211> 457
<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 129

acaccttcaa agggaccagc acaagctagc gtcaatggag tcacattaat tggagaatca 60
gtaaaaacac agtttaacta ctttaagaaa gtagacggca ttattcaaca gttgcctgaa 120
acctacttta ctcagagcag agacttagag gattttaagc ccagatcaca aatggaaact 180
gactttctcg agctcgctat ggatgaattc atacagcgat ataagctcga gggctatgcc 240
ttcgaacaca tcgtttatgg agatttcagt catggacaac ttggcggtct tcatttaatg 300
ataggcttag ccaagcgctc acaagattca ccacttaaat tagagg^ttt tatccctatg 360
gacagcacag tgaaaaatta cttcataaca gatgcgcaaa caggtt^atc aaaatgtgtg 420
tgttctgtga ttgatctttt acttgatgac tttgtcg 457

<210> 130

<211> 493

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 130

cgcaaagtat actcaactgt gtcaatactt aaatacactt actttagctg taccctacaa 60
catgagagtt attcactttg gtgctggctc tgataaagga gttgcaccag gtacagctgt 120
gctcagacaa tggttgccaa ctggcacact acttgtcgat tcagatctta atgacttcgt 180
ctccgacgca gattctactt taattggaga ctgtgcaaca gtacatacgg ctaataaatg 240
ggaccttatt attagcgata tgtatgaccc taggaccaaa catgtgacaa aagagaatga 300
ctctaaagaa gggtttttca cttatctgtg tggatttata aagcaaaaac tagccctggg 360
tggttctata gctgtaaaga taacagagca ttcttggaat gctgaccttt acaagcttat 420
gggccatttc tcatggtgga cagcttttgt tacaaatgta aatgcatcat catcggaagc 480
atttttaatt ggg 493

<210> 131 

<211> 490 

<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 131

acttaaatac acttacttta gctgtaccct acaacatgag agttattcac tttggtgctg 60
gctctgataa aggagttgca ccaggtacag ctgtgctcag acaatggttg ccaactggca 120
cactacttgt cgattcagat cttaatgact tcgtctccga cgcagattct actttaattg 180
gagactgtgc aacagtacat acggctaata aatgggacct tattattagc gatatgtatg 240
accctaggac caaacatgtg acaaaagaga atgactctaa agaagggttt ttcacttatc 300
tgtgtggatt tataaagcaa aaactagccc tgggtggttc tatagctgta aagataacag 360
agcattcttg gaatgctgac ctttacaagc ttatgggcca tttctcatgg tggacagctt 420
ttgttacaaa tgtaaatgca tcatcatcgg aagcattttt aattggggct aactatcttg 480
gcaagccgaa 490

<210> 132 
<211> 530 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 132 

taaggagaat caaatcaatg atatgattta ttctcttctg gaaaaaggta ggcttatcat 60 

tagagaaaac aacagagttg tggtttcaag tgatattctt gttaacaact aaacgaacat 120
gtttattttc ttattatttc ttactctcac tagtggtagt gaccttgacc ggtgcaccac 180
ttttgatgat gttcaagctc ctaattacac tcaacatact tcatctatga ggggggttta 240
ctatcctgat gaaattttta gatcagacac tctttattta actcaggatt tatttcttcc 300
attttattct aatgttacag ggtttcatac tattaatcat acgtttggca accctgtcat 360
accttttaag gatggtattt attttgctgc cacagagaaa tcaaatgttg tccgtggttg 420
ggtttttggt tctaccatga acaacaagtc acagtcggtg attattatta acaattctac 480
taatgttgtt atacgagcat gtaactttga attgtgtgac aaccctttct ttgctgtttc 540
taaacccata 550

<210> 133 
<211> 490 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 133 

acttaaatac acttacttta gctgtaccct acaacatgag agttattcac tttggtgctg 60 

gctctgataa aggagttgca ccaggtacag ctgtgctcag acaatggttg ccaactggca 120 

cactacttgt cgattcagat cttaatgact tcgtctccga cgcagattct actttaattg 180 

gagactgtgc aacagtacat acggctaata aatgggacct tattattagc gatatgtatg 240 
accctaggac caaacatgtg acaaaagaga atgactctaa agaagggttt ttcacttatc 300

tgtgtggatt tataaagcaa aaactagccc tgggtggttc tatagctgta aagataacag 360 

agcattcttg gaatgctgac ctttacaagc ttatgggcca tttctcatgg tggacagctt 420 

ttgttacaaa tgtaaatgca tcatcatcgg aagcattttt aattggggct aactatcttg 480 

gcaagccgaa 490



<210> 134 

<211> 550 

<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 134

taaggagaat caaatcaatg atatgattta ttctcttctg gaaaaaggta ggcttatcat 60
tagagaaaac aacagagttg tggtttcaag tgatattctt gttaacaact aaacgaacat 120
gtttattttc ttattatttc ttactctcac tagtggtagt gaccttgacc ggtgcaccac 180
ttttgatgat gttcaagctc ctaattacac tcaacatact tcatctatga ggggggttta 240
ctatcctgat gaaattttta gatcagacac tctttattta actcaggatt tatttcttcc 300
attttattct aatgttacag ggtttcatac tattaatcat acgtttggca accctgtcat 360
accttttaag gatggtattt attttgctgc cacagagaaa tcaaatgttg tccgtggttg 420
ggtttttggt tctaccatga acaacaagtc acagtcggtg attattatta acaattctac 480
taatgttgtt atacgagcat gtaactttga attgtgtgac aaccctttct ttgctgtttc 540
taaacccata 550



<210> 135 

<211> 400 

<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 135 



atcaatgata tgatttattc tcttctggaa aaaggtaggc ttatcattag agaaaacaac 60 

agagttgtgg tttcaagtga tattcttgtt aacaactaaa cgaacatgtt tattttctta 120 

ttatttctta ctctcactag tggtagtgac cttgaccggt gcaccacttt tgatgatgtt 180 

caagctccta attacactca acatacttca tctatgaggg gggtttacta tcctgatgaa 240 

atttttagat cagacactct ttatttaact caggatttat ttcttccatt ttattctaat 300 

gttacagggt ttcatactat taatcatacg tttggcaacc ctgtcatacc ttttaaggat 360 

ggtatttatt ttgctgccac agagaaatca aatgttgtcc 400 



<210> 136 
<211> 288 
<212> DNA

<213> Severe acute respiratory syndrome virus 

<400> 136 

tgatctttgc ttctccaatg tctatgcaga ttctttggta gtcaagggag atgatgtaag 60 

acaaatagcg ccaggacaaa ctggtgttat tgctgattat aattataaat tgccagatga 120 

tttcatgggt tgtgtccttg cttggaatac taggaacatt gatgctactt caactggtaa 180 

ttataattat aaatataggt atcttagaca tggcaagctt aggccctttg agagagacat 240 

atctaatgtg cctttctcca cctgatggca aaccttgcac cccacctg 288 



<210> 137 

<211> 411 
<212> • D^A 

<213> Severe acute respiratory syndrome virus 
<40Q> 137 

ctttgagaga gacatatcta atgtgccttt ctcccctgat ggcaaacctt gcaccccacc 60 
tgctcttaat tgttattggc cattaaatga ttatggtttt tacaccacta ctggcattgg 120 
ctaccaacct tacagagttg tagtactttc ttttgaactt ttaaatgcac cggccacggt 180 
ttgtggacca aaattatcca ctgaccttat taagaaccag tgtgtcaatt ttaattttaa 240 
tggactcact ggtactggtg tgttaactcc ttcttcaaag agatttcaac catttcaaca 300 
aattttgccg tgatgtttct gatttcactg attccgttcg agatcctaaa acatctgaaa 360
tattagacat ttcaccctgc gcttttgggg gtgtaagtgt aattacacct g 411



<210> 138 
<211> 357 

<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 138 

tggaaatatt ttggtggttt taatttttca caaatattac ctgaccctct aaagccaact 60 
aagaggtctt ttattgagga cttgctcttt aataaggtga cactcgctga tgctggcttc 120 
atgaagcaat atggcgaatg cctaggtgat attaatgcta gagatctcat ttgtgcgcag 180 
aagttcaatg gacttacagt gttgccacct ctgctcactg atgatatgat tgctgcctac 240
actgctgctc tagttagtgg tactgccact gctggatgga catttggtgc tggcgctgct 300
cttcaaatac cttttgctat gcaaatggca tataggttca atggcattgg agttact 357



<210> 139 

<211> 434 

<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 139 
caatatggcg aatgcctagg tgatattaat gctagagatc tcatttgtgc gcagaagttc 60

aatggactta cagtgttgcc acctctgctc actgatgata tgattgctgc ctacactgct 120 

gctctagtta gtggtactgc cactgctgga tggacatttg gtgctggcgc tgctcttcaa 180 

ataccttttg ctatgcaaat ggcatatagg ttcaatggca ttggagttac ccaaaatgtt 240

ctctatgaga accaaaaaca aatcgccaac caatttaaca aggcgattag tcaaattcaa 300 

gaatcactta caacaacatc aactgcattg ggcaagctgc aagacgttgt taaccagaat 360 

gctcaagcat taaacacact tgttaaacaa cttagctcta attttggtgc aatttcaagt 420 

gtgctaaatg atat 434 

<210> 140 
<211> 557 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 140 

acagacaata catttgtctc aggaaattgt gatgtcgtta ttggcatcat taacaacaca 60

gtttatgatc ctctgcaacc tgagcttgac tcattcaaag aagagctgga caagtacttc 120 

aaaaatcata catcaccaga tgttgatctt ggcgacattt caggcattaa cgcttctgtc 180 

gtcaacattc aaaaagaaat tgaccgcctc aatgaggtcg ctaaaaattt aaatgaatca 240

ctcattgacc ttcaagaatt gggaaaatat gagcaatata ttaaatggcc ttggtatgtt 300

tggctcggct tcattgctgg actaattgcc atcgtcatgg ttacaatctt gctttgttgc 360 

atgactagtt gttgcagttg cctcaagggt gcatgctctt gtggttcttg ctgcaagttt 420 

gatgaggatg actctgagcc agttctcaag ggtgtcaaat tacattacac ataaacgaac 480 

ttatggattt gtttatgaga ttttttactc ttagatcaat tactgcacag ccagtaaaaa 540 

ttgacaatgc ttctcct 557 

<210> 141 
<211> 530 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 141 

atgtttggct cggcttcatt gctggactaa ttgccatcgt catggttaca atcttgcttt 60 

gttgcatgac tagttgttgc agttgcctca agggtgcatg ctcttgtggt tcttgctgca 120 

agtttgatga ggatgactct gagccagttc tcaagggtgt caaattacat tacacataaa 180 

cgaacttatg gatttgttta tgagattttt tactcttaga tcaattactg cacagccagt 240 

aaaaattgac aatgcttctc ctgcaagtac tgttcatgct acagcaacga taccgctaca 300 

agcctcactc cctttcggat ggcttgttat tggcgttgca tttcttgctg tttttcagag 360 
cgctaccaaa ataattgcgc tcaataaaag atggcagcta gccctttata agggcttcca 420
gttcatttgc aatttactgc tgctatttgt taccatctat tcacatcttt tgcttgtcgc 480 
tgcaggtatg gaggcgcaat ttttgtacct ctatgccttg atatattttc 530 

<210> 142
<211> 320 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 142 

ttgctcgtac ccgctcaatg tggtcattca acccagaaac aaacattctt ctcaatgtgc 60
ctctccgggg gacaattgtg accagaccgc tcatggaaag tgaacttgtc attggtgctg 120 
tgatcattcg tggtcacttg cgaatggccg gacactccct agggcgctgt gacattaagg 180 
acctgccaaa agagatcact gtggctacat cacgaacgct ttcttattac aaattaggag 240
cgtcgcagcg tgtaggcact gattcaggtt ttgctgcata caaccgctac cgtattggaa 300 
actataaatt aaatacagac 320 

<210> 143 
<211> 417 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 143 

cgaacttatg tactcattcg tttcggaaga aacaggtacg ttaatagtta atagcgtact 60 
tctttttctt gctttcgtgg tattcttgct agtcacacta gccatcctta ctgcgcttcg 120 
attgtgtgcg tactgctgca atattgttaa cgtgagttta gtaaaaccaa cggtttacgt 180 
ctactcgcgt gttaaaaatc tgaactcttc tgaaggagtt cctgatcttc tggtctaaac 240
gaactaacta ttattattat tctgtttgga actttaacat tgcttatcat ggcagacaac 300 
ggtactatta ccgttgagga gcttaaacaa ctcctggaac aatggaacct agtaataggt 360 
ttcctattcc tagcctggat tatgttacta caatttgcct attctaatcg gaacagg 417 

<210> 144 
<211> 516 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 144 

cttgtcattg gtgctgtgat cattcgtggt cacttgcgaa tggccggaca ctccctaggg 60 
cgctgtgaca ttaaggacct gccaaaagag atcactgtgg ctacatcacg aacgctttct 120 
tattacaaat taggagcgtc gcagcgtgta ggcactgatt caggttttgc tgcatacaac 180 
cgctaccgta ttggaaacta taaattaaat acagaccacg ccggtagcaa cgacaatatt 240
gctttgctag tacagtaagt gacaacagat gtttcatctt gttgacttcc aggttacaat 300

agcagagata ttgattatca ttatgaggac tttcaggatt gatatttgga atcttgacgt 360 

tataataagt tcaatagtga gacaattatt taagcctcta actaagaaga attattcgga 420 

gttagatgat gaagaaccta tggagttaga ttatccataa aacgaacatg aaaattattc 480

tcttcctgac attgatttta tttacatctt gcgagc 516

<210> 145 
<211> 310 
<212> DNA 

<213> Severe acute respiratory syndrome virus
<400> 145 

cgatgtttca tcttgttgac ttccaggtta caatagcaga gatattgatt atcattatga 60 
ggactttcag gattgctatt tggaatcttg acgttataat aagttcaata gtgagacaat 120 
tatttaagcc tctaactaag aagaattatt cggagttaga tgatgaagaa cctatggagt 180 
tagattatcc ataaaacgaa catgaaaatt attctcttcc tgacattgat tgtatttaca 240 
tcttgcgagc tatatcacta tcaggagtgt gttagaggta cgactgtact actaaaagaa 300
ccttgcccat 310 

<210> 146

<211> 556 

<212> DNA 

<213> Severe acute respiratory syndrome virus 



<400> 146 
agaaagacag aatgaatgag ctcactttaa ttgacttcta tttgtgcttt ttagcctttc 60

tgctattcct tgttttaata atgcttatta tattttggtt ttcactcgaa atccaggatc 120

tagaagaacc ttgtaccaaa gtctaaacga acatgaaact tctcattgtt ttgacttgta 180

tttctctatg cagttgcata tgcactgtag tacagcgctg tgcatctaat aaacctcatg 240

tgcttgaaga tccttgtaag gtacaacact aggggtaata cttatagcac tgcttggctt 300

tgtgctctag gaaaggtttt accttttcat agatggcaca ctatggttca aacatgcaca 360

cctaatgtta ctatcaactg tcaagatcca gctggtggtg cgcttatagc taggtgttgg 420

taccttcatg aaggtcacca aactgctgca tttagagacg tacttgttgt tttaaataaa 480

cgaacaaatt aaaatgtctg ataatggacc ccaatcaaac caacgtagtg ccccccgcat 540

tacatttggt ggaccc 556



<210> 147 
<211> 110 
<212> DNA 
<213> Severe acute respiratory syndrome virus
<400> 147 

acgaacatga aaattattct cttcctgaca ttgattgtat ttacatcttg cgagctatat 60 
cactatcagg agtgtgttag aggtacgact gtactactaa aagaaccttg 110 



<210> 148 
<211> 363 
<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 148 

gcatttagag acgtacttgt tgttttaaat aaacgaacaa attaaaatgt ctgataatgg 60
acctcaatca agccaacgta gtgccccccg cattacattt ggtggaccca cagattcaac 120
tgacaataac cagaatggag gacgcaatgg ggcaaggcca aaacagcgcc gaccccaagg 180
tttacccaat aatactgcgt cttggttcac agctctcact cagcatggca aggaggaact 240
tagattccct cgaggccagg gcgttccaat caacaccaat agtggtccag atgaccaaat 300
tggctactac cgaagagcta cccgacgagt tcgtggtggt gacggcaaaa tgaaagagct 360
cag 363



<210> 149 
<211> 294 
<212> DNA

<213> Severe acute respiratory syndrome virus 
<400> 149 

ctatcagctg cgtgcaagat cagtttcacc aaaacttttc atcagacaag aggaggttca 60
acaagagctc tactcgccac tttttctcat tgttgctgct ctagtatttt taatactttg 120
cttcaccatt aagagaaaga cagaatgaat gagctcactt taattgactt ctatttgtgc 180
tttttagcct ttctgctatt ccttgtttta ataatgctta ttatattttg gttttcactc 240
gaaatccagg atctagaaaa accttgtacc aaaggctaaa cgaacatgaa actt 294



<210> 150 
<211> 504 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 150 

caaactgctg catttagaga cgtacttgtt gtttaaataa acgaacaaat taaaatgtct 60 
gataatggac cccaatcaaa ccaacgtagt gccccccgca ttacatttgg tggacccaca 120 
gattcaactg acaataacca gaatggagga cgcaatgggg caaggccaaa acagcgccga 180 
ccccaaggtt tacccaataa tactgcgtct tggttcacag ctctcactca gcatggcaag 240
gaggaactta gattccctcg aggccagggc gttccaatca acaccaatag tggtccagat 300 
gaccaaattg gctactaccg aagagctacc cgacgagttc gtggtggtga cggcaaaatg 360

aaagagctca gccccagatg gtacttctat tacctaggaa ctggcccaga agcttcactt 420

ccctacggcg ctaacaaaga aggcatcgta tgggttgcaa ctgagggagc cttgaataca 480 

cccaaagacc acattggcac cncgt 504

<210> 151 
<211> 474 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 151 

ctcgccactt tttctcattg ttgctgctct agtattttta atactttgct tcaccattaa 60 

gagaaagaca gaatgaatga gctcacttta attgacttct atttgtgctt tttagccttt 120 

ctgctattcc ttgttttaat aatgcttatt atattttggt tttcactcga aatccaggat 180 

ctagaagaac cttgtaccaa agtctaaacg aacatgaaac ttctcattgt tttgacttgt 240

atttctctat gcagttgcat atgcactgta gtacagcgct gtgcatctaa taaacctcat 300 

gtgcttgaag atccttgtaa ggtacaacac taggggtaat acttatagca ctgcttggct 360 

ttgtgctcta ggaaaggttt taccttttca tagatggcac actatggttc aaacatgcac 420 

acctaatgtt actatcaact gtcaagatcc agctggtggt gcgcttatag ctag 474

<210> 152 

<211> 516 

<212> DNA 

<213> Severe acute respiratory syndrome virus

<400> 152 



cattaagaga aagacagaat gaatgagctc actttaattg acttctattt gtgcttttta 60

gcctttctgc tattccttgt tttaataatg cttattatat tttggttttc actcgaaatc 120

caggatctag aagaaccttg taccaaagtc taaacgaaca tgaaacttct cattgttttg 180

acttgtattt ctctatgcag ttgcatatgc actgtagtac agcgctgtgc atctaataaa 240

cctcatgtgc ttgaagatcc ttgtaaggta caacactagg ggtaatactt atagcactgc 300

ttggctttgt gctctaggaa aggttttacc ttttcataga tggcacacta tggttcaaac 360

atgcacacct aatgttacta tcaactgtca agatccagct ggtggtgcgc ttatagctag 420

gtgttggtac cttcatgaag gtcaccaaac tgctgcattt agagacgtac ttgttgtttt 480

aaataaacga acaaattaaa atgtctgata atggac 516



<210> 153 
<211> 451 
<212> DNA 
<213> Severe acute respiratory syndrome virus
<400> 153 



ccaaggttta cccaataata ctgcgtcttg gttcacagct ctcactcagc atggcaagga 60

ggaacttaga ttccctcgag gccagggcgt tccaatcaac accaatagtg gtccagatga 120

ccaaattggc tactaccgaa gagctacccg acgagttcgt ggtggtgacg gcaaaatgaa 180

agagctcagc cccagatggt acttctatta cctaggaact ggcccagaag cttcacttcc 240

ctacggcgct aacaaagaag gcatcgtatg ggttgcaact gagggagcct tgaatacacc 300

caaagaccac attggcaccc gcaatcctaa taacaatgct gccaccgtgc tacaacttcc 360

tcaaggaaca acattgccaa aaggcttcta cgcagaggga agcagaggcg nnnnnnnnnn 420

ctcttctcgc tcctcatcac gtagtcgcgg t 451


<210> 154 
<211> 495
<212> DNA 

<213> Severe acute respiratory syndrome virus 






<400> 154

gatgaagctc agcctttgcc gcagagacaa aagaagcagc ccactgtgac tcttcttcct 60

gcggctgaca tggatgattt ctccagacaa cttcaaaatt ccatgagtgg agcttctgct 120

gattcaactc aggcataaac actcatgatg accacacaag gcagatgggc tatgtaaacg 180

ttttcgcaat tccgtttacg atacatagtc tactcttgtg cagaatgaat tctcgtaact 240

aaacagcaca agtaggttta gttaacttta atctcacata gcaatcttta atcaatgtgt 300

aacattaggg aggacttgaa agagccacca cattttcatc gaggccacgc ggagtacgat 360

cgagggtaca gtgaataatg ctagggagag ctgcctatat ggaagagccc taatgtgtaa 420

aattaatttt agtagtgcta tccccatgtg attttaatag cttcttagga gaatgacaaa 480

aaaaaaaaaa aaaaa 495



<210> 155 
<211> 512 
<212> DNA 

<213> Severe acute respiratory syndrome virus 
<400> 155 

acaaggccaa actgtcacta agaaatctgc tgctgaggca tctaaaaagc ctcgccaaaa 60
acgtactgcc acaaaacagt acaacgtcac tcaagcattt gggagacgtg gtccagaaca 120
aacccaagga aatttcgggg accaagacct aatcagacaa ggaactgatt acaaacattg 180 
gccgcaaatt gcacaatttg ctccaagtgc ctctgcattc tttggaatgt cacgcattgg 240 
catggaagtc acaccttcgg gaacatggct gacttatcat ggagccatta aattggatga 300 




caaagatcca caattcaaag acaacgtcat actgctgaac aagcacattg acgcatacaa 360
aacattccca ccaacagagc ctaaaaagga caaaaagaaa aagactgatg aagctcagcc 420 
tttgccgcag agacaaaaga agcagcccac tgtgactctt cttcctgcgg ctgatatgga 480 
tgatttctcc agacaacttc aaaattccat ga 512 

<210> 156 
<211> 442 
<212> DNA 

<213> Severe acute respiratory syndrome virus 

<400> 156 

tgtgactctt cttcctgcgg ctgatatgga tgtttctcca gacaacttca aaattccatg 60
agtggagctt ctgctgattc aactcaggca taaacactca tgatgaccac acaaggcaga 120
tgggctatgt aaacgttttc gcaattccgt ttacgataca tagtctactc ttgtgcagaa 180
tgaattctcg taactaaaca gcacaagtag gtttagttaa ctttaatctc acatagcaat 240 
ctttaatcaa tgtgtaacat tagggaggac ttgaaagagc caccacattt tcatcgaggc 300 
cacgcggagt acgatcgagg gtacagtgaa taatgctagg gagagctgcc tatatggaag 360 
agccctaatg tgtaaaatta attttagtag tgctatcccc atgtgatttt aatagcttct 420 
taggagaatg acaaaaaaaa aa 442 

<210> 157 
<211> 24
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 157

atgaattacc aagtcaatgg ttac 24 

<210> 158 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 158 

gaagctattc gtcacgttcg 20 

<210> 159 

<211> 22

<212> DNA 

<213> Artificial Sequence 
<220>

<223> Primer 

<400> 159 

ctgtagaaaa tcctagctgg ag 22 

<210> 160 

<211> 21 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Primer 

<400> 160 

cataaccagt cggtacagct a 21 

<210> 161 

<211> 20 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 161 

ttatcacccg cgaagaagct 20



<210> 162

<211> 22

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 162

ctctagttgc atgacagccc tc 22

<210> 163

<211> 24

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 163

tcgtgcgtgg attggctttg atgt 24



<210> 164 

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Primer
<400> 164 

gggttgggac tatcctaagt gtga 24 

<210> 165 

<211> 22 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 165 

taacacacaa acaccatcat ca 22 



<210> 166

<211> 23

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 166

ggttgggact atcctaagtg tga 23

<210> 167

<211> 24

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 167

ccatcatcag atagaatcat cata 24

<210> 168

<211> 21

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 168

cctctcttgt tcttgctcgc a 21

<210> 169

<211> 21

<212> DNA

<213> Artificial Sequence
<220>

<223> Primer

<400> 169

tatagtgagc cgccacacat g 21



<210> 170 

<211> 21 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 



<220>

<221> misc_feature

<222> (12)..(12)

<223> n is a, c, or t 

<400> 170 

taacacacaa cnccatcatc a 21 



<210> 171 

<211> 21 

<212> DNA

<213> Artificial Sequence 



<220> 

<223> Primer 



<400> 171 

ctaacatgct taggataatd g 21 



<210> 172 

<211> 21 

<212> DNA 

<213> Artificial Sequence



<220>

<223> Primer 
<400> 172 

gcctctcttg ttcttgctcg c 21



<210> 173 

<211> 21

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 173 

caggtaagcg taaaactcat c 21 



<210> 174
<211> 17
<212> DNA

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 174 

tacacacctc agcgttg 17






<210> 175 

<211> 16 

<212> DNA

<213> Artificial Sequence
<220> 

<223> Primer 

<400> 175 

cacgaacgtg acgaat 16 



<210> 176 

<211> 20

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer

<400> 176

gccggagctc tgcagaattc 20 



<210> 177 

<211> 47 

<212> DNA

<213> Artificial Sequence 

<220> 

<223> Primer 

<400> 177 

caggaaacag ctatgacttg catcaccact agttgtgcca ccaggtt 47 



<210> 178 

<211> 46 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 

<400> 178 

tgtaaaacga cggccagttg atgggatggg actatcctaa gtgtga 46 



<210> 179 
<211> 20 
<212> DNA 



<213> Artificial Sequence
<220>

<223> Primer

<400> 179

gcataggcag tagttgcatc 20



<210> 180 

<211> 8 

<212> PRT 

<213> Artificial Sequence 

<220> 

<223> ATP Binding Domain 



<220> 

<221> MISC_FEATURE 

<222> (1)..(1)

<223> Xaa = A or G

<220> 

<221> misc_feature 

<222> (2)..(5)

<223> Xaa can be any naturally occurring amino acid 

<220> 

<221> MISC_FEATURE

<222> (8)..(8)

<223> Xaa = S or T 

<400> 180 

Xaa Xaa Xaa Xaa Xaa Gly Lys Xaa
1 5 



<210> 181
<211> 23
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 181

Trp Tyr Val Trp Leu Gly Phe Ile Ala Gly Leu Ile Ala Ile Val Met
1 5 10 15

Val Thr Ile Leu Leu Cys Cys
20

<210> 182

<211> 16

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 182

Met Asp Leu Phe Met Arg Phe Phe Thr Leu Arg Ser Ile Thr Ala Gln
1 5 10 15



<210> 183 
<211> 150 
<212> PRT 

<213> Severe acute respiratory syndrome virus
<400> 183

Met Arg Cys Trp Leu Cys Trp Lys Cys Lys Ser Lys Asn Pro Leu Leu
1 5 10 15

Tyr Asp Ala Asn Tyr Phe Val Cys Trp His Thr His Asn Tyr Asp Tyr
20 25 30

Cys Ile Pro Tyr Asn Ser Val Thr Asp Thr Ile Val Val Thr Glu Gly
35 40 45

Asp Gly Ile Ser Thr Pro Lys Leu Lys Glu Asp Tyr Gln Ile Gly Gly
50 55 60

Tyr Ser Glu Asp Arg His Ser Gly Val Lys Asp Tyr Val Val Val His
65 70 75 80

Gly Tyr Phe Thr Glu Val Tyr Tyr Gln Leu Glu Ser Thr Gln Ile Thr
85 90 95

Thr Asp Thr Gly Ile Glu Asn Ala Thr Phe Phe Ile Phe Asn Lys Leu
100 105 110

Val Lys Asp Pro Pro Asn Val Gln Ile His Thr Ile Asp Gly Ser Ser
115 120 125

Gly Val Ala Asn Pro Ala Met Asp Pro Ile Tyr Asp Glu Pro Thr Thr
130 135 140

Thr Thr Ser Val Pro Leu
145 150



<210> 184 
<211> 20 
<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 184 

Met Met Pro Thr Thr Leu Phe Ala Gly Thr His Ile Thr Met Thr Thr
1 5 10 15
Val Tyr His Ile
20



<210> 185

<211> 42 

<212> PRT

<213> Severe acute respiratory syndrome virus 

<400> 185 

Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn Val Ser
1 5 10 15

Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn Leu Asn
20 25 30

Ser Ser Glu Gly Val Pro Asp Leu Leu Val
35 40



<210> 186

<211> 39 

<212> PRT

<213> Severe acute respiratory syndrome virus 

<400> 186 

Met Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu
1 5 10 15

Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met
20 25 30

Leu Leu Gln Phe Ala Tyr Ser
35



<210> 187

<211> 100 

<212> PRT 

<213> Severe acute respiratory syndrome virus

<400> 187 

Pro Leu Arg Gly Thr Ile Val Thr Arg Pro Leu Met Glu Ser Glu Leu
1 5 10 15

Val Ile Gly Ala Val Ile Ile Arg Gly His Leu Arg Met Ala Gly His
20 25 30

Ser Leu Gly Arg Cys Asp Ile Lys Asp Leu Pro Lys Glu Ile Thr Val
35 40 45

Ala Thr Ser Arg Thr Leu Ser Tyr Tyr Lys Leu Gly Ala Ser Gln Arg
50 55 60

Val Gly Thr Asp Ser Gly Phe Ala Ala Tyr Asn Arg Tyr Arg Ile Gly
65 70 75 80

Asn Tyr Lys Leu Asn Thr Asp His Ala Gly Ser Asn Asp Asn Ile Ala
85 90 95

Leu Leu Val Gln
100



<210> 188

<211> 23 
<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 188

Phe Tyr Leu Cys Phe Leu Ala Phe Leu Leu Phe Leu Val Leu Ile Met
1 5 10 15

Leu Ile Ile Phe Trp Phe Ser
20

<210> 189

<211> 19

<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 189 

Leu Leu Ile Val Leu Thr Cys Ile Ser Leu Cys Ser Cys Ile Cys Thr
1 5 10 15

Val Val Gln



<210> 190 
<211> 24 
<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 190 

Ile Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro His Val Leu
1 5 10 15

Glu Asp Pro Cys Lys Val Gln His
20
<210> 191
<211> 22
<212> PRT

<213> Severe acute respiratory syndrome virus
<400> 191

Cys Ile Cys Thr Val Val Gln Arg Cys Ala Ser Asn Lys Pro His Val
1 5 10 15

Leu Glu Asp Pro Cys Lys
20

<210> 192

<211> 22

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 192

Val Val Ala Val Ile Gln Glu Ile Gln Leu Leu Ala Ala Val Gly Glu
1 5 10 15

Ile Leu Leu Leu Glu Trp
20

<210> 193
<211> 19
<212> DNA

<213> Artificial Sequence
<220>

<223> Linker

<400> 193

aattcgcggc cgcgtcgac 19

<210> 194

<211> 15

<212> DNA

<213> Artificial Sequence
<220>

<223> Linker

<400> 194

gtcgacgcgg ccgcg 15



<210> 195 

<211> 19 

<212> DNA 

<213> Artificial Sequence 
<220> 
<223> Primer

<400> 195

aattcgcggc cgcgtcgac 19 

<210> 196

<211> 19

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Primer 
<400> 196 

ggcctcttcg ctattacgc 19 

<210> 197 

<211> 21 

<212> DNA 

<213> Artificial Sequence 

<220> 

<223> Primer 

<400> 197

tgcaggtcga ctctagagga t 21 

<210> 198

<211> 410

<212> PRT 

<213> Avian infectious bronchitis virus 

<400> 198

Met Ala Ser Gly Lys Ala Ala Gly Lys Thr Asp Ala Pro Ala Pro Val
1 5 10 15

Ile Lys Leu Gly Gly Pro Lys Pro Pro Lys Val Gly Ser Ser Gly Asn
20 25 30

Ala Ser Trp Phe Gln Ala Ile Lys Ala Lys Lys Leu Asn Thr Pro Pro
35 40 45

Pro Lys Phe Glu Gly Ser Gly Val Pro Asp Asn Glu Asn Ile Lys Pro
50 55 60

Ser Gln Gln His Gly Tyr Trp Arg Arg Gln Ala Arg Phe Lys Pro Gly
65 70 75 80

Lys Gly Gly Arg Lys Pro Val Pro Asp Ala Trp Tyr Phe Tyr Tyr Thr
85 90 95

Gly Thr Gly Pro Ala Ala Asp Leu Asn Trp Gly Asp Thr Gln Asp Gly
100 105 110

Ile Val Trp Val Ala Ala Lys Gly Ala Asp Thr Lys Ser Arg Ser Asn
115 120 125

Gln Gly Thr Arg Asp Pro Asp Lys Phe Asp Gln Tyr Pro Leu Arg Phe
130 135 140

Ser Asp Gly Gly Pro Asp Gly Asn Phe Arg Trp Asp Phe Ile Pro Leu
145 150 155 160

Lys Asn Arg Gly Arg Ser Gly Arg Ser Thr Ala Ala Ser Ser Ala Ala
165 170 175

Ala Ser Arg Ala Pro Ser Arg Glu Gly Ser Arg Gly Arg Arg Ser Asp
180 185 190

Ser Gly Asp Asp Leu Ile Ala Arg Ala Ala Lys Ile Ile Gln Asp Gln
195 200 205

Gln Lys Lys Gly Ser Arg Ile Thr Lys Ala Lys Ala Asp Glu Met Ala
210 215 220

His Arg Arg Tyr Cys Lys Arg Thr Ile Pro Pro Asn Tyr Arg Val Asp
225 230 235 240

Gln Val Phe Gly Pro Arg Thr Lys Gly Lys Glu Gly Asn Phe Gly Asp
245 250 255

Asp Lys Met Asn Glu Glu Gly Ile Lys Asp Gly Arg Val Thr Ala Met
260 265 270

Leu Asn Leu Val Pro Ser Ser His Ala Cys Leu Phe Gly Ser Arg Val
275 280 285

Thr Pro Lys Leu Gln Leu Asp Gly Leu His Leu Arg Phe Glu Phe Thr
290 295 300

Thr Val Val Pro Cys Asp Asp Pro Gln Phe Asp Asn Tyr Val Lys Ile
305 310 315 320

Cys Asp Gln Cys Val Asp Gly Val Gly Thr Arg Pro Lys Asp Asp Glu
325 330 335

Pro Lys Pro Lys Ser Arg Ser Ser Ser Arg Pro Ala Thr Arg Gly Asn
340 345 350

Ser Pro Ala Pro Arg Gln Gln Arg Pro Lys Lys Glu Lys Lys Leu Lys
355 360 365

Lys Gln Asp Asp Glu Ala Asp Lys Ala Leu Thr Ser Asp Glu Glu Arg
370 375 380

Asn Asn Ala Gln Leu Glu Phe Tyr Asp Glu Pro Lys Val Ile Asn Trp
385 390 395 400

Gly Asp Ala Ala Leu Gly Glu Asn Glu Leu
405 410



<210> 199

<211> 30

<212> PRT

<213> conotoxin

<400> 199

Cys Ile Ala Val Gly Gln Leu Cys Val Phe Trp Asn Ile Gly Arg Pro
1 5 10 15

Cys Cys Ser Gly Leu Cys Val Phe Ala Cys Thr Val Lys Leu
20 25 30

<210> 200

<211> 31

<212> PRT

<213> Severe acute respiratory syndrome virus

<400> 200

Cys Ile Ser Leu Cys Ser Cys Ile Cys Thr Val Val Gln Arg Cys Ala
1 5 10 15

Ser Asn Lys Pro His Val Leu Glu Asp Pro Cys Lys Val Gln His
20 25 30

<210> 201

<211> 310

<212> DNA

<213> Severe acute respiratory syndrome virus

<400> 201

cgatgtttca tcttgttgac ttccaggtta caatagcaga gatattgatt atcattatga 60
ggactttcag gattgctatt tggaatcttg acgttataat aagttcaata gtgagacaat 120
tatttaagcc tctaactaag aagaattatt cggagttaga tgatgaagaa cctatggagt 180
tagattatcc ataaaacgaa catgaaaatt attctcttcc tgacattgat tgtatttaca 240
tcttgcgagc tatatcacta tcaggagtgt gttagaggta cgactgtact actaaaagaa 300
ccttgcccat 310

<210> 202 

<211> 556 

<212> DNA

<213> Severe acute respiratory syndrome virus 

<400> 202 



agaaagacag aatgaatgag ctcactttaa ttgacttcta tttgtgcttt ttagcctttc 60

tgctattcct tgttttaata atgcttatta tattttggtt ttcactcgaa atccaggatc 120

tagaagaacc ttgtaccaaa gtctaaacga acatgaaact tctcattgtt ttgacttgta 180

tttctctatg cagttgcata tgcactgtag tacagcgctg tgcatctaat aaacctcatg 240

tgcttgaaga tccttgtaag gtacaacact aggggtaata cttatagcac tgcttggctt 300

tgtgctctag gaaaggtttt accttttcat agatggcaca ctatggttca aacatgcaca 360

cctaatgtta ctatcaactg tcaagatcca gctggtggtg cgcttatagc taggtgttgg 420

taccttcatg aaggtcacca aactgctgca tttagagacg tacttgttgt tttaaataaa 480

cgaacaaatt aaaatgtctg ataatggacc ccaatcaaac caacgtagtg ccccccgcat 540

tacatttggt ggaccc 556



<210> 203 

<211> 1255 

<212> PRT 

<213> Severe acute respiratory syndrome virus 

<400> 203 

Met Phe Ile Phe Leu Leu Phe Leu Thr Leu Thr Ser Gly Ser Asp Leu
1 5 10 15

Asp Arg Cys Thr Thr Phe Asp Asp Val Gln Ala Pro Asn Tyr Thr Gln
20 25 30

His Thr Ser Ser Met Arg Gly Val Tyr Tyr Pro Asp Glu Ile Phe Arg
35 40 45

Ser Asp Thr Leu Tyr Leu Thr Gln Asp Leu Phe Leu Pro Phe Tyr Ser
50 55 60

Asn Val Thr Gly Phe His Thr Ile Asn His Thr Phe Gly Asn Pro Val
65 70 75 80

Ile Pro Phe Lys Asp Gly Ile Tyr Phe Ala Ala Thr Glu Lys Ser Asn
85 90 95

Val Val Arg Gly Trp Val Phe Gly Ser Thr Met Asn Asn Lys Ser Gln
100 105 110

Ser Val Ile Ile Ile Asn Asn Ser Thr Asn Val Val Ile Arg Ala Cys
115 120 125

Asn Phe Glu Leu Cys Asp Asn Pro Phe Phe Ala Val Ser Lys Pro Met
130 135 140

Gly Thr Gln Thr His Thr Met Ile Phe Asp Asn Ala Phe Asn Cys Thr
145 150 155 160

Phe Glu Tyr Ile Ser Asp Ala Phe Ser Leu Asp Val Ser Glu Lys Ser
165 170 175

Gly Asn Phe Lys His Leu Arg Glu Phe Val Phe Lys Asn Lys Asp Gly
180 185 190

Phe Leu Tyr Val Tyr Lys Gly Tyr Gln Pro Ile Asp Val Val Arg Asp
195 200 205

Leu Pro Ser Gly Phe Asn Thr Leu Lys Pro Ile Phe Lys Leu Pro Leu
210 215 220

Gly Ile Asn Ile Thr Asn Phe Arg Ala Ile Leu Thr Ala Phe Ser Pro
225 230 235 240

Ala Gln Asp Ile Trp Gly Thr Ser Ala Ala Ala Tyr Phe Val Gly Tyr
245 250 255

Leu Lys Pro Thr Thr Phe Met Leu Lys Tyr Asp Glu Asn Gly Thr Ile
260 265 270

Thr Asp Ala Val Asp Cys Ser Gln Asn Pro Leu Ala Glu Leu Lys Cys
275 280 285

Ser Val Lys Ser Phe Glu Ile Asp Lys Gly Ile Tyr Gln Thr Ser Asn
290 295 300

Phe Arg Val Val Pro Ser Gly Asp Val Val Arg Phe Pro Asn Ile Thr
305 310 315 320

Asn Leu Cys Pro Phe Gly Glu Val Phe Asn Ala Thr Lys Phe Pro Ser
325 330 335



Val Tyr Ala Trp Glu Arg Lys Lys Ile Ser Asn Cys Val Ala Asp Tyr
340 345 350

Ser Val Leu Tyr Asn Ser Thr Phe Phe Ser Thr Phe Lys Cys Tyr Gly
355 360 365

Val Ser Ala Thr Lys Leu Asn Asp Leu Cys Phe Ser Asn Val Tyr Ala
370 375 380

Asp Ser Phe Val Val Lys Gly Asp Asp Val Arg Gln Ile Ala Pro Gly
385 390 395 400

Gln Thr Gly Val Ile Ala Asp Tyr Asn Tyr Lys Leu Pro Asp Asp Phe
405 410 415

Met Gly Cys Val Leu Ala Trp Asn Thr Arg Asn Ile Asp Ala Thr Ser
420 425 430

Thr Gly Asn Tyr Asn Tyr Lys Tyr Arg Tyr Leu Arg His Gly Lys Leu
435 440 445

Arg Pro Phe Glu Arg Asp Ile Ser Asn Val Pro Phe Ser Pro Asp Gly
450 455 460

Lys Pro Cys Thr Pro Pro Ala Leu Asn Cys Tyr Trp Pro Leu Asn Asp
465 470 475 480

Tyr Gly Phe Tyr Thr Thr Thr Gly Ile Gly Tyr Gln Pro Tyr Arg Val
485 490 495

Val Val Leu Ser Phe Glu Leu Leu Asn Ala Pro Ala Thr Val Cys Gly
500 505 510

Pro Lys Leu Ser Thr Asp Leu Ile Lys Asn Gln Cys Val Asn Phe Asn
515 520 525

Phe Asn Gly Leu Thr Gly Thr Gly Val Leu Thr Pro Ser Ser Lys Arg
530 535 540

Phe Gln Pro Phe Gln Gln Phe Gly Arg Asp Val Ser Asp Phe Thr Asp
545 550 555 560

Ser Val Arg Asp Pro Lys Thr Ser Glu Ile Leu Asp Ile Ser Pro Cys
565 570 575
Ala Phe Gly Gly Val Ser Val Ile Thr Pro Gly Thr Asn Ala Ser Ser
580 585 590

Glu Val Ala Val Leu Tyr Gln Asp Val Asn Cys Thr Asp Val Ser Thr
595 600 605

Ala Ile His Ala Asp Gln Leu Thr Pro Ala Trp Arg Ile Tyr Ser Thr
610 615 620

Gly Asn Asn Val Phe Gln Thr Gln Ala Gly Cys Leu Ile Gly Ala Glu
625 630 635 640

His Val Asp Thr Ser Tyr Glu Cys Asp Ile Pro Ile Gly Ala Gly Ile
645 650 655

Cys Ala Ser Tyr His Thr Val Ser Leu Leu Arg Ser Thr Ser Gln Lys
660 665 670

Ser Ile Val Ala Tyr Thr Met Ser Leu Gly Ala Asp Ser Ser Ile Ala
675 680 685

Tyr Ser Asn Asn Thr Ile Ala Ile Pro Thr Asn Phe Ser Ile Ser Ile
690 695 700

Thr Thr Glu Val Met Pro Val Ser Met Ala Lys Thr Ser Val Asp Cys
705 710 715 720

Asn Met Tyr Ile Cys Gly Asp Ser Thr Glu Cys Ala Asn Leu Leu Leu
725 730 735

Gln Tyr Gly Ser Phe Cys Thr Gln Leu Asn Arg Ala Leu Ser Gly Ile
740 745 750

Ala Ala Glu Gln Asp Arg Asn Thr Arg Glu Val Phe Ala Gln Val Lys
755 760 765

Gln Met Tyr Lys Thr Pro Thr Leu Lys Tyr Phe Gly Gly Phe Asn Phe
770 775 780

Ser Gln Ile Leu Pro Asp Pro Leu Lys Pro Thr Lys Arg Ser Phe Ile
785 790 795 800

Glu Asp Leu Leu Phe Asn Lys Val Thr Leu Ala Asp Ala Gly Phe Met
805 810 815
Lys Gln Tyr Gly Glu Cys Leu Gly Asp Ile Asn Ala Arg Asp Leu Ile
820 825 830

Cys Ala Gln Lys Phe Asn Gly Leu Thr Val Leu Pro Pro Leu Leu Thr
835 840 845

Asp Asp Met Ile Ala Ala Tyr Thr Ala Ala Leu Val Ser Gly Thr Ala
850 855 860

Thr Ala Gly Trp Thr Phe Gly Ala Gly Ala Ala Leu Gln Ile Pro Phe
865 870 875 880

Ala Met Gln Met Ala Tyr Arg Phe Asn Gly Ile Gly Val Thr Gln Asn
885 890 895

Val Leu Tyr Glu Asn Gln Lys Gln Ile Ala Asn Gln Phe Asn Lys Ala
900 905 910

Ile Ser Gln Ile Gln Glu Ser Leu Thr Thr Thr Ser Thr Ala Leu Gly
915 920 925

Lys Leu Gln Asp Val Val Asn Gln Asn Ala Gln Ala Leu Asn Thr Leu
930 935 940

Val Lys Gln Leu Ser Ser Asn Phe Gly Ala Ile Ser Ser Val Leu Asn
945 950 955 960

Asp Ile Leu Ser Arg Leu Asp Lys Val Glu Ala Glu Val Gln Ile Asp
965 970 975

Arg Leu Ile Thr Gly Arg Leu Gln Ser Leu Gln Thr Tyr Val Thr Gln
980 985 990

Gln Leu Ile Arg Ala Ala Glu Ile Arg Ala Ser Ala Asn Leu Ala Ala
995 1000 1005

Thr Lys Met Ser Glu Cys Val Leu Gly Gln Ser Lys Arg Val Asp
1010 1015 1020

Phe Cys Gly Lys Gly Tyr His Leu Met Ser Phe Pro Gln Ala Ala
1025 1030 1035

Pro His Gly Val Val Phe Leu His Val Thr Tyr Val Pro Ser Gln
1040 1045 1050
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Glu Arg Asn Phe Thr Thr Ala Pro" Ala He Cys His Glu Gly Lys 
1055 . ' 1060 - . 1065 



Ala Tyr Phe Pro Arg Glu Gly Val ' Phe Val Phe Asn Gly Thr Ser 
1070 1075 1080 



Trp Phe He Thr Gin Arg Asn Phe Phe Ser Pro Gin He He Thr 
1085 .1090 1095 



Thr Asp Asn Thr Phe Val Ser Gly Asn Cys Asp Val Val He Gly 
1100 . 1105 1110 '* 



He He Asn Asn Thr Val Tyr Asp Pro Leu Gin Pro Glu Leu Asp 
1115 1120 1125 



Ser Phe Lys Glu Glu Leu Asp Lys Tyr Phe Lys Asn His Thr Ser 
* 1130 1135 1140 



Pro Asp Val Asp Leu Gly Asp He Ser Gly He Asn Ala Ser Val 
1145 1150 1155 



Val Asn Ile Gln Lys Glu Ile Asp Arg Leu Asn Glu Val Ala Lys 
1160 1165 1170 



Asn Leu Asn Glu Ser Leu Ile Asp Leu Gln Glu Leu Gly Lys Tyr 
1175 1180 1185 



Glu Gln Tyr Ile Lys Trp Pro Trp Tyr Val Trp Leu Gly Phe Ile 
1190 1195 1200 



Ala Gly Leu Ile Ala Ile Val Met Val Thr Ile Leu Leu Cys Cys 
1205 1210 1215 



Met Thr Ser Cys Cys Ser Cys Leu Lys Gly Ala Cys Ser Cys Gly 
1220 1225 1230 



Ser Cys Cys Lys Phe Asp Glu Asp Asp Ser Glu Pro Val Leu Lys 
1235 1240 1245 



Gly Val Lys Leu His Tyr Thr 
1250 1255 



<210> 204 

<211> 422 

<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 204 

Met Ser Asp Asn Gly Pro Gln Ser Asn Gln Arg Ser Ala Pro Arg Ile 
1 5 10 15 

Thr Phe Gly Gly Pro Thr Asp Ser Thr Asp Asn Asn Gln Asn Gly Gly 
20 25 30 

Arg Asn Gly Ala Arg Pro Lys Gln Arg Arg Pro Gln Gly Leu Pro Asn 
35 40 45 



Asn Thr Ala Ser Trp Phe Thr Ala Leu Thr Gln His Gly Lys Glu Glu 
50 55 60 



Leu Arg Phe Pro Arg Gly Gln Gly Val Pro Ile Asn Thr Asn Ser Gly 

65 70 75 80 

Pro Asp Asp Gln Ile Gly Tyr Tyr Arg Arg Ala Thr Arg Arg Val Arg 

85 90 95 



Gly Gly Asp Gly Lys Met Lys Glu Leu Ser Pro Arg Trp Tyr Phe Tyr 
100 105 110 

Tyr Leu Gly Thr Gly Pro Glu Ala Ser Leu Pro Tyr Gly Ala Asn Lys 
115 120 125 



Glu Gly Ile Val Trp Val Ala Thr Glu Gly Ala Leu Asn Thr Pro Lys 
130 135 140 



Asp His Ile Gly Thr Arg Asn Pro Asn Asn Asn Ala Ala Thr Val Leu 
145 150 155 160 

Gln Leu Pro Gln Gly Thr Thr Leu Pro Lys Gly Phe Tyr Ala Glu Gly 
165 170 175 



Ser Arg Gly Gly Ser Gln Ala Ser Ser Arg Ser Ser Ser Arg Ser Arg 
180 185 190 



Gly Asn Ser Arg Asn Ser Thr Pro Gly Ser Ser Arg Gly Asn Ser Pro 
195 200 205 



Ala Arg Met Ala Ser Gly Gly Gly Glu Thr Ala Leu Ala Leu Leu Leu 
210 215 220 



Leu Asp Arg Leu Asn Gln Leu Glu Ser Lys Val Ser Gly Lys Gly Gln 
225 230 235 240 
Gln Gln Gln Gly Gln Thr Val Thr Lys Lys Ser Ala Ala Glu Ala Ser 
245 250 255 



Lys Lys Pro Arg Gln Lys Arg Thr Ala Thr Lys Gln Tyr Asn Val Thr 
260 265 270 



Gln Ala Phe Gly Arg Arg Gly Pro Glu Gln Thr Gln Gly Asn Phe Gly 
275 280 285 



Asp Gln Asp Leu Ile Arg Gln Gly Thr Asp Tyr Lys His Trp Pro Gln 
290 295 300 



Ile Ala Gln Phe Ala Pro Ser Ala Ser Ala Phe Phe Gly Met Ser Arg 
305 310 315 320 



Ile Gly Met Glu Val Thr Pro Ser Gly Thr Trp Leu Thr Tyr His Gly 
325 330 335 



Ala Ile Lys Leu Asp Asp Lys Asp Pro Gln Phe Lys Asp Asn Val Ile 
340 345 350 



Leu Leu Asn Lys His Ile Asp Ala Tyr Lys Thr Phe Pro Pro Thr Glu 
355 360 365 



Pro Lys Lys Asp Lys Lys Lys Lys Thr Asp Glu Ala Gln Pro Leu Pro 
370 375 380 



Gln Arg Gln Lys Lys Gln Pro Thr Val Thr Leu Leu Pro Ala Ala Asp 
385 390 395 400 



Met Asp Asp Phe Ser Arg Gln Leu Gln Asn Ser Met Ser Gly Ala Ser 
405 410 415 



Ala Asp Ser Thr Gln Ala 

420 



<210> 205 

<211> 221 

<212> PRT 

<213> Sars associated coronavirus 

<400> 205 

Met Ala Asp Asn Gly Thr Ile Thr Val Glu Glu Leu Lys Gln Leu Leu 
1 5 10 15 



Glu Gln Trp Asn Leu Val Ile Gly Phe Leu Phe Leu Ala Trp Ile Met 
20 25 30 

Leu Leu Gln Phe Ala Tyr Ser Asn Arg Asn Arg Phe Leu Tyr Ile Ile 
35 40 45 



Lys Leu Val Phe Leu Trp Leu Leu Trp Pro Val Thr Leu Ala Cys Phe 
50 55 60 



Val Leu Ala Ala Val Tyr Arg Ile Asn Trp Val Thr Gly Gly Ile Ala 
65 70 75 80 



Ile Ala Met Ala Cys Ile Val Gly Leu Met Trp Leu Ser Tyr Phe Val 
85 90 95 



Ala Ser Phe Arg Leu Phe Ala Arg Thr Arg Ser Met Trp Ser Phe Asn 
100 105 110 



Pro Glu Thr Asn Ile Leu Leu Asn Val Pro Leu Arg Gly Thr Ile Val 
115 120 125 



Thr Arg Pro Leu Met Glu Ser Glu Leu Val Ile Gly Ala Val Ile Ile 
130 135 140 



Arg Gly His Leu Arg Met Ala Gly His Ser Leu Gly Arg Cys Asp Ile 
145 150 155 160 



Lys Asp Leu Pro Lys Glu Ile Thr Val Ala Thr Ser Arg Thr Leu Ser 
165 170 175 



Tyr Tyr Lys Leu Gly Ala Ser Gln Arg Val Gly Thr Asp Ser Gly Phe 
180 185 190 



Ala Ala Tyr Asn Arg Tyr Arg Ile Gly Asn Tyr Lys Leu Asn Thr Asp 
195 200 205 



His Ala Gly Ser Asn Asp Asn Ile Ala Leu Leu Val Gln 
210 215 220 



<210> 206 
<211> 76 
<212> PRT 

<213> Severe acute respiratory syndrome virus 
<400> 206 

Met Tyr Ser Phe Val Ser Glu Glu Thr Gly Thr Leu Ile Val Asn Ser 
1 5 10 15 
Val Leu Leu Phe Leu Ala Phe Val Val Phe Leu Leu Val Thr Leu Ala 
20 25 30 



Ile Leu Thr Ala Leu Arg Leu Cys Ala Tyr Cys Cys Asn Ile Val Asn 
35 40 45 



Val Ser Leu Val Lys Pro Thr Val Tyr Val Tyr Ser Arg Val Lys Asn 
50 55 60 



Leu Asn Ser Ser Glu Gly Val Pro Asp Leu Leu Val 
65 70 75 